{ "architectures": [ "BertForSequenceClassification" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 128, "id2label": { "0": "PF17802\nPrealbumin-like fold domain\nThis entry contains a prealbumin-like domain from a wide variety of bacterial surface proteins. This entry corresponds to domain 1 and domain 3 of SpaA from Corynebacterium diphtheriae [PMID:19805181]. Some members of this family contain an isopeptide bond.", "1": "PF05013\nN-formylglutamate amidohydrolase\nFormylglutamate amidohydrolase (FGase) catalyses the terminal reaction in the five-step pathway for histidine utilisation in Pseudomonas putida. By this action, N-formyl-L-glutamate (FG) is hydrolysed to produce L-glutamate plus formate [PMID:3308850].", "2": "PF18821\nLarge polyvalent protein-associated domain 7\nThis domain contains conserved aspartate and phenylalanine residues. It is widely present in polyvalent proteins and gene neighbourhoods of conjugative elements [PMID:28559295]. This domain is also known as PTox1 [PMID:25358815].", "3": "PF18075\nFtsX extracellular domain\nThis is the extracellular domain (ECD) found in FtsX enzyme, a homolog of the transmembrane PG-hydrolase regulator. The FtsX extracellular domain binds the PG peptidase Rv2190c/RipC N-terminal segment, causing a conformational change that activates the enzyme ileading to PG hydrolysis in Mycobacterium tuberculosis. Structural analysis of FtsX ECD reveals fold containing two lobes connected by a flexible hinge. Mutations in the hydrophobic cleft between the lobes showed reduction in RipC binding in vitro and inhibition of FtsX function in Mycobacterium smegmatis [PMID:24843173].", "4": "PF04039\nDomain related to MnhB subunit of Na+/H+ antiporter\nPossible subunit of Na+/H+ antiporter [PMID:9852009], [PMID:11356194]. Predicted integral membrane protein, usually four transmembrane regions in this domain. 
Often found in bacterial NADH dehydrogenase subunit.", "5": "PF06130\nPhosphate propanoyltransferase\nThis family includes phosphotransacylases (PTACs) required for the degradation of 1,2-propanediol (1,2-PD) [PMID:17158662].", "6": "PF02654\nCobalamin-5-phosphate synthase\nThis is a family of cobalamin-5-phosphate synthases, CobS, from bacteria. The CobS enzyme catalyses the synthesis of AdoCbl-5'-p from AdoCbi-GDP and alpha-ribazole-5'-P [PMID:10518530]. This enzyme is involved in the cobalamin (vitamin B12) biosynthesis pathway, in particular the nucleotide loop assembly stage, in conjunction with CobC, CobU and CobT [PMID:10518530].", "7": "PF00163\nRibosomal protein S4/S9 N-terminal domain\nThis family includes small ribosomal subunit S9 from prokaryotes and S16 from metazoans. This domain is predicted to bind to ribosomal RNA [PMID:9707415]. This domain is composed of four helices in the known structure. However, the domain is discontinuous in sequence and the alignment for this family contains only the first three helices.", "8": "Inorganic pyrophosphatase", "9": "PF06165\nGlycosyltransferase family 36\nThe glycosyltransferase family 36 includes cellobiose phosphorylase (EC:2.4.1.20), cellodextrin phosphorylase (EC:2.4.1.49), chitobiose phosphorylase (EC:2.4.1.-). Many members of this family contain two copies of this domain.", "10": "PF17921\nIntegrase zinc binding domain\nThis zinc binding domain is found in a wide variety of integrase proteins.", "11": "MerR family regulatory protein", "12": "PF02466\nTim17/Tim22/Tim23/Pmp24 family\nThe pre-protein translocase of the mitochondrial outer membrane (Tom) allows the import of pre-proteins from the cytoplasm. Tom forms a complex with a number of proteins, including Tim17 [PMID:8893850, PMID:10366717, PMID:7721939]. 
Tim17, Tim22, and Tim23, are the central components of the widely conserved multi-subunit protein translocases, TIM23 and TIM22, which mediate protein transport across and into the inner mitochondrial membrane, respectively [PMID:27760563]. In addition, several Tim17 family proteins occupy the inner and outer membranes of plastids. This family also includes Pmp24 a peroxisomal protein. The involvement of this domain in the targeting of PMP24 remains to be proved. PMP24 was known as Pmp27 [PMID:7721939]. Family members are suggested to be exclusive to eukaryotes, where the distribution in the eukaryotic subgroups of the mitochondrial Tim17, Tim22 and Tim23 proteins, as well as the peroxisomal Tim17 family proteins, suggests that they all likely to be present in the last eukaryotic common ancestor (LECA) [PMID:27760563].", "13": "PF00237\nRibosomal protein L22p/L17e\nThis family includes L22 from prokaryotes and chloroplasts and L17 from eukaryotes.", "14": "PF05649\nPeptidase family M13\nM13 peptidases are well-studied proteases found in a wide range of organisms including mammals and bacteria. In mammals they participate in processes such as cardiovascular development, blood-pressure regulation, nervous control of respiration, and regulation of the function of neuropeptides in the central nervous system. In bacteria they may be used for digestion of milk. ", "15": "PF06094\nGamma-glutamyl cyclotransferase, AIG2-like\nGGACT, gamma-glutamylamine cyclotransferase, is a ubiquitous enzyme found in bacteria, plants, and metazoans from Dictyostelium through to humans. It converts gamma-glutamylamines to free amines and 5-oxoproline.", "16": "PF01369\nSec7 domain\nThe Sec7 domain is a guanine-nucleotide-exchange-factor (GEF) for the Pfam:PF00025 family [PMID:8945478].", "17": "PF01649\nRibosomal protein S20\nBacterial ribosomal protein S20 interacts with 16S rRNA [PMID:3373529].", "18": "PF01975\nSurvival protein SurE\nE. 
coli cells with the surE gene disrupted are found to survive poorly in stationary phase [PMID:7928962]. It is suggested that SurE may be involved in stress response. Yeast also contains a member of the family Swiss:P38254. Swiss:P30887 can complement a mutation in acid phosphatase, suggesting that members of this family could be phosphatases.", "19": "PF07503\nHypF finger\nThe HypF family of proteins are involved in the maturation and regulation of hydrogenase ([PMID:9492269]). In the N-terminus they appear to have two Zinc finger domains, as modelled by this family.", "20": "PF00731\nAIR carboxylase\nMembers of this family catalyse the decarboxylation of 1-(5-phosphoribosyl)-5-amino-4-imidazole-carboxylate (AIR). This family catalyse the sixth step of de novo purine biosynthesis. Some members of this family contain two copies of this domain.", "21": "PF03334\nNa+/H+ antiporter subunit\nThis family includes PhaG from Rhizobium meliloti Swiss:Q9ZNG0, MnhG from Staphylococcus aureus Swiss:Q9ZNG0, YufB from Bacillus subtilis Swiss:O05227.", "22": "PF01985\nCRS1 / YhbY (CRM) domain\nEscherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants [PMID:17105995]. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing [PMID:18065687]. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes [PMID:17105995]. 
YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome [PMID:12429100][PMID:12360533].", "23": "PF11760\nCobalamin synthesis G N-terminal\nMembers of this family are involved in cobalamin synthesis. The gene encoded by Swiss:P72862 has been designated cbiH but in fact represents a fusion between cbiH and cbiG. As other multi-functional proteins involved in cobalamin biosynthesis catalyse adjacent steps in the pathway, including CysG, CobL (CbiET), CobIJ and CobA-HemD, it is therefore possible that CbiG catalyses a reaction step adjacent to CbiH. In the anaerobic pathway such a step could be the formation of a gamma lactone, which is thought to help to mediate the anaerobic ring contraction process [PMID:9742225]. Within the cobalamin synthesis pathway CbiG catalyses the both the opening of the lactone ring and the extrusion of the two-carbon fragment of cobalt-precorrin-5A from C-20 and its associated methyl group (deacylation) to give cobalt-precorrin-5B [PMID:16866557]. The N-terminal of the enzyme is conserved in this family, and the C-terminal and the mid-sections are conserved independently in other families, CbiG_C and CbiG_mid, although the distinct function of each region is unclear.", "24": "Formate--tetrahydrofolate ligase", "25": "PF04055\nRadical SAM superfamily\nRadical SAM proteins catalyse diverse reactions, including unusual methylations, isomerisation, sulphur insertion, ring formation, anaerobic oxidation and protein radical formation.", "26": "PF08712\nScaffold protein Nfu/NifU N terminal\nThis domain is found at the N terminus of NifU and NifU related proteins, and in the human Nfu protein. 
Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters [PMID:12886008][PMID:14993221].", "27": "3'5'-cyclic nucleotide phosphodiesterase", "28": "PF02537\nCrcB-like protein, Camphor Resistance (CrcB)\nCRCB is a family of bacterial integral membrane proteins with four TMs. Overexpression in E. coli also leads to camphor resistance [PMID:8844142, PMID:12904550].", "29": "PF04122\nCell wall binding domain 2 (CWB2)\nThis domain is found in 1 to 3 tandem copies in a wide variety of bacterial cell surface proteins. It has been shown that the three tandem repeats of the CWB2 domain are essential for correct anchoring to the cell wall [PMID:25649385]. It was shown in SlpA and Cwp2 that these domains were essential for the binding of PSII, an anionic teichoic acid-like component of the cell wall [PMID:25649385]. The structure of the Cwp8 and Cwp6 proteins shows that this domain forms a trimeric arrangement with each domain adopting a structure with some similarity to the Toprim fold [PMID:28132783]. A groove containing many conserved residues was predicted to be the site of the PSII molecule [PMID:28132783].", "30": "PF03619\nOrganic solute transporter Ostalpha\nThis family is a transmembrane organic solute transport protein. In vertebrates these proteins form a complex with Ostbeta, and function as bile transporters [PMID:15563450]. 
In plants they may transport brassinosteroid-like compounds and act as regulators of cell death [PMID:20830211].", "31": "PF08148\nDSHCT (NUC185) domain\nThis C terminal domain is found in DOB1/SK12/helY-like DEAD box helicases [PMID:15112237].", "32": "5'-3' exonuclease, N-terminal resolvase-like domain", "33": "Bacillus/Clostridium GerA spore germination protein", "34": "PF04237\nYjbR\nYjbR has a CyaY-like fold [PMID:23044854].", "35": "PF10436\nMitochondrial branched-chain alpha-ketoacid dehydrogenase kinase\nCatabolism and synthesis of leucine, isoleucine and valine are finely balanced, allowing the body to make the most of dietary input but removing excesses to prevent toxic build-up of their corresponding keto-acids. This is the butyryl-CoA dehydrogenase, subunit A domain 3, a largely alpha-helical bundle of the enzyme BCDHK. This enzyme is the regulator of the dehydrogenase complex that breaks branched-chain amino-acids down, by phosphorylating and thereby inactivating it when synthesis is required. The domain is associated with family HATPase_c Pfam:PF02518 which is towards the C-terminal.", "36": "PF06421\nGTP-binding protein LepA C-terminus\nThis family consists of the C-terminal region of several pro- and eukaryotic GTP-binding LepA proteins [PMID:11489118].", "37": "PF04253\nTransferrin receptor-like dimerisation domain\nThis domain is involved in dimerisation of the transferrin receptor as shown in its crystal structure.", "38": "PF01657\nSalt stress response/antifungal\nThis domain is often found in association with the kinase domains Pfam:PF00069 or Pfam:PF07714. In many proteins it is duplicated. It contains six conserved cysteines which are involved in disulphide bridges [PMID:19603485]. 
It has a role in salt stress response [PMID:19036832] and has antifungal activity [PMID:17338634].", "39": "PF02545\nMaf-like protein\nMaf is a putative inhibitor of septum formation [PMID:8387996] in eukaryotes, bacteria, and archaea.", "40": "PF02811\nPHP domain\nThe PHP (Polymerase and Histidinol Phosphatase) domain is a putative phosphoesterase domain.", "41": "Granulin", "42": "PF00557\nMetallopeptidase family M24\nThis family contains metallopeptidases. It also contains non-peptidase homologues such as the N terminal domain of Spt16 which is a histone H3-H4 binding module [PMID:18579787].", "43": "PF07884\nVitamin K epoxide reductase family\nVitamin K epoxide reductase (VKOR) recycles reduced vitamin K, which is used subsequently as a co-factor in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. VKORC1 is a member of a large family of predicted enzymes that are present in vertebrates, Drosophila, plants, bacteria and archaea [PMID:15276181]. Four cysteine residues and one residue, which is either serine or threonine, are identified as likely active-site residues [PMID:15276181]. In some plant and bacterial homologues the VKORC1 homologous domain is fused with domains of the thioredoxin family of oxidoreductases [PMID:15276181].", "44": "Ribosomal protein S15", "45": "PF03862\nSpoVAC/SpoVAEB sporulation membrane protein\nMembers of this family are all transcribed from the spoVA operon [PMID:11751839]. Bacillus and Clostridium are two well studied endospore forming bacteria. Spore formation provides a resistance mechanism in response to extreme or unfavourable environmental conditions such as heat, radiation, and chemical agents or nutrient deprivation. The reverse process termed germination takes place where spores develop into growing cells in response to nutrient availability or stress reduction. Nutrient germinant receptors (GRs) and the SpoVA proteins are important players in the germination process. 
In B. subtilis the SpoVAC and SpoVAEB, belonging to this domain family, are predicted to be membrane proteins, with two to five membrane-spanning regions. Biophysical and biochemical studies suggest that SpoVAC acts as a mechano-sensitive channel with properties that would allow the release of Ca-DPA (dipicolinic acid) and amino acids during germination of the spore. The release of Ca-DPA is a crucial event during spore germination. When expressed in E. coli SpoVAC provides protection against osmotic downshift. Furthermore, SpoVAC acts as a channel that facilitates the efflux down the concentration gradient of osmolytes up to a mass of at least 600 Da [PMID:24666282]. Another conserved SpoVA protein in all spore-forming bacteria is SpoVAEb, which appears to be an integral membrane protein with no known function [PMID:27044622].", "46": "CheB methylesterase", "47": "PF17757\nUvrB interaction domain\nThis domain is found in the UvrB protein where it interacts with the UvrA protein [PMID:19287003].", "48": "PF09269\nDomain of unknown function (DUF1967)\nMembers of this family contain a four-stranded beta sheet and three alpha helices flanked by an additional beta strand. They are predominantly found in the bacterial GTP-binding protein Obg, and are still functionally uncharacterised [PMID:15019792].", "49": "PF03449\nTranscription elongation factor, N-terminal\nThis domain adopts a long alpha-hairpin structure.", "50": "PF02686\nGlu-tRNAGln amidotransferase C subunit\nThis is a family of Glu-tRNAGln amidotransferase C subunits. The Glu-tRNA Gln amidotransferase enzyme itself is an important translational fidelity mechanism replacing incorrectly charged Glu-tRNAGln with the correct Gln-tRNAGln via transamidation of the misacylated Glu-tRNAGln [PMID:9342321]. 
This activity supplements the lack of glutaminyl-tRNA synthetase activity in gram-positive eubacteria, cyanobacteria, Archaea, and organelles [PMID:9342321].", "51": "PF02777\nIron/manganese superoxide dismutases, C-terminal domain\nSuperoxide dismutases (SODs) catalyse the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. In humans, there is a cytoplasmic Cu/Zn SOD, and a mitochondrial Mn/Fe SOD. The C-terminal domain is a mixed alpha/beta fold.", "52": "PF01867\nCRISPR associated protein Cas1\nClustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. This family of proteins corresponds to Cas1, a CRISPR-associated protein. Cas1 may be involved in linking DNA segments to CRISPR [PMID:16079334].", "53": "Single cache domain 3", "54": "PF01193\nRNA polymerase Rpb3/Rpb11 dimerisation domain\nThe two eukaryotic subunits Rpb3 and Rpb11 dimerise to form a platform onto which the other subunits of the RNA polymerase assemble (D/L in archaea). The prokaryotic equivalent of the Rpb3/Rpb11 platform is the alpha-alpha dimer. The dimerisation domain of the alpha subunit/Rpb3 is interrupted by an insert domain (Pfam:PF01000). Some of the alpha subunits also contain iron-sulphur binding domains (Pfam:PF00037). Rpb11 is found as a continuous domain. Members of this family include: alpha subunit from eubacteria, alpha subunits from chloroplasts, Rpb3 subunits from eukaryotes, Rpb11 subunits from eukaryotes, RpoD subunits from archaeal spp, and RpoL subunits from archaeal spp.", "55": "PF01904\nProtein of unknown function DUF72\nThe function of this family is unknown.", "56": "PF08486\nStage II sporulation protein\nThis domain is found in the stage II sporulation protein SpoIID. 
SpoIID is necessary for membrane migration as well as for some of the earlier steps in engulfment during bacterial endospore formation [PMID:12502745]. The domain is also found in amidase enhancer proteins. Amidases, like SpoIID, are cell wall hydrolases [PMID:10961456].", "57": "PF06965\nNa+/H+ antiporter 1\nThis family contains a number of bacterial Na+/H+ antiporter 1 proteins. These are integral membrane proteins that catalyse the exchange of H+ for Na+ in a manner that is highly dependent on the pH [PMID:1657980].", "58": "PF14805\nTetrahydrodipicolinate N-succinyltransferase N-terminal\nThis is the N-terminal domain of 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase [PMID:9671504].", "59": "PF16124\nRecQ zinc-binding\nThis domain is the zinc-binding domain of ATP-dependent DNA helicase RecQ [1-2].", "60": "PF14403\nCircularly permuted ATP-grasp type 2\nCircularly permuted ATP-grasp prototyped by Roseiflexus RoseRS_2616 that is associated in gene neighborhoods with a GCS2-like COOH-NH2 ligase, alpha/beta hydrolase fold peptidase, GAT-II -like amidohydrolase, and M20 peptidase. Members of this family are predicted to be involved in the biosynthesis of small peptides [PMID:20023723].", "61": "SRP54-type protein, helical bundle domain", "62": "Homeodomain-like domain", "63": "PF02625\nXdhC and CoxI family\nThis domain is often found in association with an NAD-binding region, related to TrkA-N (Pfam:PF02254; personal obs:C. Yeats). XdhC is believed to be involved in the attachment of molybdenum to Xanthine Dehydrogenase ([PMID:10217763]).", "64": "PF02436\nConserved carboxylase domain\nThis domain represents a conserved region in pyruvate carboxylase (PYC), oxaloacetate decarboxylase alpha chain (OADA), and transcarboxylase 5s subunit. The domain is found adjacent to the HMGL-like domain (Pfam:PF00682) and often close to the biotin_lipoyl domain (Pfam:PF00364) of biotin requiring enzymes. 
", "65": "PF13793\nN-terminal domain of ribose phosphate pyrophosphokinase\nThis family is frequently found N-terminal to the Pribosyltran, Pfam:PF00156.", "66": "PF08393\nDynein heavy chain, N-terminal region 2\nDyneins are described as motor proteins of eukaryotic cells, as they can convert energy derived from the hydrolysis of ATP to force and movement along cytoskeletal polymers, such as microtubules. This region is found C-terminal to the dynein heavy chain N-terminal region 1 (Pfam:PF08385) in many members of this family. No functions seem to have been attributed specifically to this region.", "67": "Disintegrin", "68": "PF07804\nHipA-like C-terminal domain\nThe members of this family are similar to a region close to the C-terminus of the HipA protein expressed by various bacterial species (for example Swiss:P23874). This protein is known to be involved in high-frequency persistence to the lethal effects of inhibition of either DNA or peptidoglycan synthesis [PMID:1715862]. When expressed alone, it is toxic to bacterial cells [PMID:1715862], but it is usually tightly associated with HipB [PMID:8021189], and the HipA-HipB complex may be involved in autoregulation of the hip operon. The hip proteins may be involved in cell division control and may interact with cell division genes or their products [PMID:8021189].", "69": "PF02383\nSacI homology domain\nThis Pfam family represents a protein domain which shows homology to the yeast protein SacI Swiss:P32368. The SacI homology domain is most notably found at the amino terminal of the inositol 5'-phosphatase synaptojanin.", "70": "Bacterial dnaA protein helix-turn-helix", "71": "PF06580\nHistidine kinase\nThis family represents a region within bacterial histidine kinase enzymes. Two-component signal transduction systems such as those mediated by histidine kinase are integral parts of bacterial cellular regulatory processes, and are used to regulate the expression of genes involved in virulence [PMID:12462127]. 
Members of this family often contain Pfam:PF02518 and/or Pfam:PF00672.", "72": "PF00484\nCarbonic anhydrase\nThis family includes carbonic anhydrases as well as a family of non-functional homologues related to YbcF.", "73": "PF02152\nDihydroneopterin aldolase\nThis enzyme EC:4.1.2.25 catalyses the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin in the biosynthetic pathway of tetrahydrofolate.", "74": "PF16353\nBeta-galactosidase, domain 4\nThis entry represents domain 4 found in beta-galactosidase [PMID:8008071] and it is organised in a jelly-roll type barrel (Rutkiewicz-Krotewicz M. et al. Crystals 2018, 8(1), 13, https://doi.org/10.3390/cryst8010013).", "75": "PF04892\nVanZ like family\nThis family contains several examples of the VanZ protein, but also contains examples of phosphotransbutyrylases [PMID:7867956].", "76": "HTH domain found in ParB protein", "77": "PF06724\nDomain of Unknown Function (DUF1206)\nThis region consists of a pair of transmembrane helices and occurs three times in each of the family member proteins.", "78": "PF17853\nGGDEF-like domain\nThis domain is distantly related to the GGDEF domain, suggesting these may be diguanylate cyclase enzymes.", "79": "Tetratricopeptide repeat", "80": "PF00278\nPyridoxal-dependent decarboxylase, C-terminal sheet domain\nThese pyridoxal-dependent decarboxylases act on ornithine, lysine, arginine and related substrates. ", "81": "PF14508\nGlycosyl-hydrolase 97 N-terminal\nThis N-terminal domain of glycosyl-hydrolase-97 [PMID:16131397] contributes part of the active site pocket. 
It is also important for contact with the catalytic and C-terminal domains of the whole [PMID:18848471, PMID:18981178].", "82": "PF02817\ne3 binding domain\nThis family represents a small domain of the E2 subunit of 2-oxo-acid dehydrogenases responsible for the binding of the E3 subunit.", "83": "PF06719\nAraC-type transcriptional regulator N-terminus\nThis family represents the N-terminus of bacterial ARAC-type transcriptional regulators. In E. coli, these regulate the L-arabinose operon through sensing the presence of arabinose, and when the sugar is present, transmitting this information from the arabinose-binding domains to the protein's DNA-binding domains [PMID:12683999]. This family might represent the N-terminal arm of the protein, which binds to the C-terminal DNA binding domains to hold them in a state where the protein prefers to loop and remain non-activating [PMID:9600837]. All family members contain the Pfam:PF00165 domain.", "84": "Animal haem peroxidase", "85": "PF02563\nPolysaccharide biosynthesis/export protein\nThis is a family of periplasmic proteins involved in polysaccharide biosynthesis and/or export.", "86": "PF04519\nPolymer-forming cytoskeletal\nThis is a family of bactofilins, a functionally diverse class of cytoskeletal, polymer-forming, proteins that is widely conserved among bacteria. In the example species C. crescentus, two bactofilins assemble into a membrane-associated laminar structure that shows cell-cycle-dependent polar localisation and acts as a platform for the recruitment of a cell wall biosynthetic enzyme involved in polar morphogenesis. Bactofilins display distinct subcellular distributions and dynamics in different bacterial species, suggesting that they are versatile structural elements that have adopted a range of different cellular functions.", "87": "PF05383\nLa domain\nThis presumed domain is found at the N-terminus of La RNA-binding proteins as well as other proteins [PMID:8035818]. 
The function of this region is uncertain.", "88": "Translation initiation factor SUI1", "89": "PF15901\nSortilin, neurotensin receptor 3, C-terminal\nSortilin_C is the C-terminal cytoplasmic tail of sortilin, a Vps10p domain-containing family of proteins [PMID:9013611, PMID:9756851]. Most sortilin is expressed within intracellular compartments, where it chaperones diverse ligands, including proBDNF and acid hydrolases. The sortilin cytoplasmic tail is homologous to mannose 6-phosphate receptor and is required for the intracellular trafficking of cargo proteins via interactions with distinct adaptor molecules [PMID:10085125, PMID:11331584]. In addition to mediating lysosomal targeting of specific acid hydrolases, the sortilin cytoplasmic tail also directs trafficking of BDNF to the secretory pathway in neurons, where it can be released in response to depolarisation to modulate cell survival and synaptic plasticity [PMID:19122660].", "90": "PF04675\nDNA ligase N terminus\nThis region is found in many but not all ATP-dependent DNA ligase enzymes (EC:6.5.1.1). It is thought to be involved in DNA binding and in catalysis. In human DNA ligase I (Swiss:P18858), and in Saccharomyces cerevisiae (Swiss:P04819), this region was necessary for catalysis, and separated from the amino terminus by targeting elements. In vaccinia virus (Swiss:P16272) this region was not essential for catalysis, but deletion decreases the affinity for nicked DNA and decreased the rate of strand joining at a step subsequent to enzyme-adenylate formation [PMID:9016621].", "91": "Delta-aminolevulinic acid dehydratase", "92": "PF04316\nAnti-sigma-28 factor, FlgM\nFlgM binds and inhibits the activity of the transcription factor sigma 28. Inhibition of sigma 28 prevents the expression of genes from flagellar transcriptional class 3, which include genes for the filament and chemotaxis. 
Correctly assembled basal body-hook structures export FlgM, relieving inhibition of sigma 28 and allowing expression of class 3 genes. NMR studies show that free FlgM is mostly unfolded, which may facilitate its export. The C terminal half of FlgM adopts a tertiary structure when it binds to sigma 28. All mutations in FlgM that prevent sigma 28 inhibition affect the C-terminal domain and is the region thought to constitute the binding domain. A minimal binding domain has been identified between Glu 64 and Arg 88 in Salmonella typhimurium (Swiss:P26477). The N-terminal portion remains unstructured and may be necessary for recognition by the export machinery [PMID:9095196].", "93": "PTS system mannose/fructose/sorbose family IID component", "94": "PF12399\nBranched-chain amino acid ATP-binding cassette transporter\nThis domain family is found in bacteria, archaea and eukaryotes, and is approximately 30 amino acids in length. The family is found in association with Pfam:PF00005. There is a conserved AYLG sequence motif. This family is the C terminal of an ATP dependent branched-chain amino acid transporter [PMID:2195019]. This domain is essential for LPS transport, through critical interactions with Walker A and switch helix domains [PMID:31431556].", "95": "Ribosomal protein S6", "96": "PF16123\nHydroxyacylglutathione hydrolase C-terminus\nThis domain is found at the C-terminus of hydroxyacylglutathione hydrolase enzymes. Substrate binding occurs at the interface between this domain and the catalytic domain (Pfam:PF00753) [1-3].", "97": "PF01484\nNematode cuticle collagen N-terminal domain\nThe function of this domain is unknown. It is found in the N-terminal region of nematode cuticle collagens, see Pfam:PF01391. 
Cuticle is a tough elastic structure secreted by hypodermal cells and is primarily composed of collagen proteins [PMID:7828882].", "98": "PF02547\nQueuosine biosynthesis protein\nQueuosine (Q) biosynthesis protein, or S-adenosylmethionine:tRNA -ribosyltransferase-isomerase, is required for the synthesis of the queuosine precursor (oQ). It catalyses the transfer and isomerisation of the ribose moiety from AdoMet to the 7-aminomethyl group of 7-deazaguanine (preQ1-tRNA) to form epoxyqueuosine (oQ-tRNA). Q is a hypermodified nucleoside usually found at the first position of the anticodon of asparagine, aspartate, histidine, and tyrosine tRNAs [PMID:8347586, PMID:12731872]. In Streptococcus gordonii , QueA has been shown to play a role in the regulation of arginine deiminase genes [PMID:18552185].", "99": "Glutamyl-tRNAGlu reductase, N-terminal domain", "100": "PF02632\nBioY family\nA number of bacterial genes are involved in bioconversion of pimelate into dethiobiotin [PMID:2110099]. BioY is a component of the BioMNY transport system involved in biotin uptake in prokaryotes [PMID:17301237].", "101": "PF01226\nFormate/nitrite transporter\nProteins in this entry belong to the Formate-Nitrite Transporter (FNT) family and includes the nitrite transport protein NirC and formate channel FocA [PMID:22407320, PMID:]. They have a pentameric architecture with structural similarity to aquaporins and glyceroporins [PMID:22407320]. Proteins in this family transport the structurally related compounds, formate and nitrite.", "102": "Oxysterol-binding protein", "103": "PF08338\nDomain of unknown function (DUF1731)\nThis domain of unknown function appears towards the C-terminus of proteins of the NAD dependent epimerase/dehydratase family (Pfam:PF01370) in bacteria, eukaryotes and archaea. 
Many of the proteins in which it is found are involved in cell-division inhibition.", "104": "PF01368\nDHH family\nIt is predicted that this family of proteins all perform a phosphoesterase function. It included the single stranded DNA exonuclease RecJ.", "105": "PF17912\nMalK OB fold domain\nThis entry corresponds to one of two OB-fold domains found in the MalK transport protein.", "106": "PF00988\nCarbamoyl-phosphate synthase small chain, CPSase domain\nThe carbamoyl-phosphate synthase domain is in the amino terminus of protein. Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines [PMID:1972379]. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. See Pfam:PF00289. The small chain has a GATase domain in the carboxyl terminus. See Pfam:PF00117.", "107": "PF04402\nProtein of unknown function (DUF541)\nMembers of this family have so far been found in bacteria and mouse SwissProt or TrEMBL entries. However possible family members have also been identified in translated rat (Genbank:AW144450) and human (Genbank:AI478629) ESTs. A mouse family member has been named SIMPL (signalling molecule that associates with mouse pelle-like kinase). SIMPL appears to facilitate and/or regulate complex formation between IRAK/mPLK (IL-1 receptor-associated kinase) and IKK (inhibitor of kappa-B kinase) containing complexes, and thus regulate NF-kappa-B activity [PMID:11096118]. Separate experiments demonstrate that a mouse family member (named LaXp180) binds the Listeria monocytogenes surface protein ActA, which is a virulence factor that induces actin polymerisation. 
It may also bind stathmin, a protein involved in signal transduction and in the regulation of microtubule dynamics [PMID:11207567]. In bacteria its function is unknown, but it is thought to be located in the periplasm or outer membrane.", "108": "PF10431\nC-terminal, D2-small domain, of ClpB protein\nThis is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA, Pfam:PF00004) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. In Thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerisation, forming a tight interface with the D2-large domain of a neighbouring subunit and thereby providing enough binding energy to stabilise the functional assembly [PMID:14567920]. The domain is associated with two Clp_N, Pfam:PF02861, at the N-terminus as well as AAA, Pfam:PF00004 and AAA_2, Pfam:PF07724.", "109": "PF02953\nTim10/DDP family zinc finger\nPutative zinc binding domain with four conserved cysteine residues. This domain is found in the human disease protein Swiss:O60220. Members of this family such as Tim9 and Tim10 are involved in mitochondrial protein import [PMID:11101512]. Members of this family seem to be localised to the mitochondrial intermembrane space [PMID:8663351].", "110": "RTX calcium-binding nonapeptide repeat (4 copies)", "111": "PF13732\nDomain of unknown function (DUF4162)\nThis domain is found at the C-terminus of bacterial ABC transporter proteins. The function is not known.", "112": "PF01386\nRibosomal L25p family\nRibosomal protein L25 is an RNA binding protein, that binds 5S rRNA. This family includes Ctc from B. 
subtilis Swiss:P14194, which is induced by stress.", "113": "PTS HPr component phosphorylation site", "114": "Ribosomal protein L31", "115": "PF02887\nPyruvate kinase, alpha/beta domain\nAs well as being found in pyruvate kinase this family is found as an isolated domain in some bacterial proteins.", "116": "Enolase, N-terminal domain", "117": "PF03733\nInner membrane component domain\nDomain occurs as one or more copies in bacterial and eukaryotic proteins. These are membrane proteins of four TM regions, two appearing in each of the two copies when both are present. Many of the latter members also carry the sodium/calcium exchanger protein family Pfam:PF01699, which have multipass membrane regions.", "118": "PF00766\nElectron transfer flavoprotein FAD-binding domain\nThis domain is found at the C-terminus of electron transfer flavoprotein alpha chain and binds to FAD [PMID:8962055]. The fold consists of a five-stranded parallel beta sheet as the core of the domain, flanked by alternating helices. A small part of this domain is donated by the beta chain [PMID:8962055].", "119": "PF04296\nProtein of unknown function (DUF448)\nThe YlxR family has been demonstrated to regulate metabolic gene expression [PMID:30355672].", "120": "Amiloride-sensitive sodium channel", "121": "PF04371\nPorphyromonas-type peptidyl-arginine deiminase\nPeptidyl-arginine deiminase (PAD) enzymes catalyse the deimination of the guanidino group from carboxy-terminal arginine residues of various peptides to produce ammonia. PAD from Porphyromonas gingivalis (PPAD) appears to be evolutionarily unrelated to mammalian PAD (Pfam:PF03068), which is a metalloenzyme. PPAD is thought to belong to the same superfamily as aminotransferase and arginine deiminase, and to form an alpha/beta propeller structure. This family has previously been named PPADH (Porphyromonas peptidyl-arginine deiminase homologues) [PMID:11504612]. 
The predicted catalytic residues in PPAD (Swiss:Q9RQJ2) are Asp130, Asp187, His236, Asp238 and Cys351 [PMID:11504612]. These are absolutely conserved with the exception of Asp187 which is absent in two family members. PPAD is also able to catalyse the deimination of free L-arginine, but has primarily peptidyl-arginine specificity. It may have a FMN cofactor [PMID:10377098].", "122": "PF14905\nOuter membrane protein beta-barrel family\nThis family includes proteins annotated as TonB dependent receptors. But it is also likely to contain other membrane beta barrel proteins of other functions.", "123": "[2Fe-2S] binding domain", "124": "PF03457\nHelicase associated domain\nThis short domain is found in multiple copies in bacterial helicase proteins. The domain is predicted to contain 3 alpha helices. The function of this domain may be to bind nucleic acid.", "125": "Zona pellucida-like domain", "126": "Fructose-bisphosphate aldolase class-II", "127": "PF04367\nProtein of unknown function (DUF502)\nPredicted to be an integral membrane protein.", "128": "PF03015\nMale sterility protein\nThis family represents the C-terminal region of the male sterility protein in a number of arabidopsis and drosophila. A sequence-related jojoba acyl CoA reductase is also included.", "129": "PF04205\nFMN-binding domain\nThis conserved region includes the FMN-binding site of the NqrC protein [PMID:11248234] as well as the NosR and NirI regulatory proteins. This domain is post-translationally flavinylated that may facilitate electron transfer, and thus, resembles multiheme cytochromes [PMID:34032212].", "130": "Ribosomal protein L9, N-terminal domain", "131": "PF14450\nCell division protein FtsA\nFtsA is essential for bacterial cell division, and co-localises to the septal ring with FtsZ. It has been suggested that the interaction of FtsA-FtsZ has arisen through coevolution in different bacterial strains [PMID:9352931]. 
The FtsA protein contains two structurally related actin-like ATPase domains which are also structurally related to the ATPase domains of HSP70 (see PF00012). FtsA has a SHS2 domain PF02491 inserted in to the RnaseH fold PF02491 [PMID:15281131].", "132": "Adenylosuccinate synthetase", "133": "AAA domain", "134": "PF03116\nNQR2, RnfD, RnfE family\nThis family of bacterial proteins includes a sodium-translocating NADH-ubiquinone oxidoreductase (i.e. a respiration linked sodium pump). In Vibrio cholerae, it negatively regulates the expression of virulence factors through inhibiting (by an unknown mechanism) the transcription of the transcriptional activator ToxT [PMID:10077658]. The family also includes proteins involved in nitrogen fixation, RnfD and RnfE. The similarity of these proteins to NADH-ubiquinone oxidoreductases was previously noted [PMID:9154934].", "135": "PF05960\nBacterial protein of unknown function (DUF885)\nThis family consists of several hypothetical bacterial proteins several of which are putative membrane proteins.", "136": "Ribosomal protein L5", "137": "PF03776\nSeptum formation topological specificity factor MinE\nThe E. coli minicell locus was shown to code for three gene products (MinC, MinD, and MinE) whose coordinate action is required for proper placement of the division septum. The minE gene codes for a topological specificity factor that, in wild-type cells, prevents the division inhibitor from acting at internal division sites while permitting it to block septation at polar sites [PMID:2645057].", "138": "PF05226\nCHASE2 domain\nCHASE2 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in bacteria. Specifically, CHASE2 domains are found in histidine kinases, adenylate cyclases, serine/threonine kinases and predicted diguanylate cyclases/phosphodiesterases. 
Environmental factors that are recognised by CHASE2 domains are not known at this time [PMID:12486065].", "139": "PF17871\nAAA lid domain\nThis entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains.", "140": "PF00687\nRibosomal protein L1p/L10e family\nThis family includes prokaryotic L1 and eukaryotic L10.", "141": "PF06245\nProtein of unknown function (DUF1015)\nFamily of proteins with unknown function found in archaea and bacteria.", "142": "PF02646\nRmuC family\nThis family contains several bacterial RmuC DNA recombination proteins. The function of the RMUC protein is unknown but it is suspected that it is either a structural protein that protects DNA against nuclease action, or is itself involved in DNA cleavage at the regions of DNA secondary structures [PMID:10886369].", "143": "PF16863\nN-terminal barrel of NtMGAM and CtMGAM, maltase-glucoamylase\nNtCtMGAM_N is a beta-barrel-like structure just N-terminal to the catalytic domain of maltase-glucoamylase in eukaryotes. It contributes to the architecture of the substrate-binding site, by donating a loop that comes into close contact with two regions in the catalytic domain thereby creating the site [PMID:18036614]. This family is frequently found at the N-terminus of Glycosyl hydrolase 31, Pfam:PF01055, to which it contributes as above.", "144": "PF02583\nMetal-sensitive transcriptional repressor\nThis is a family of metal-sensitive repressors, involved in resistance to metal ions. Members of this family bind copper, nickel or cobalt ions via conserved cysteine and histidine residues. In the absence of metal ions, these proteins bind to promoter regions and repress transcription. When bound to metal ions they are unable to bind DNA, leading to transcriptional derepression [1-5].", "145": "PF01556\nDnaJ C terminal domain\nThis family consists of the C terminal region of the DnaJ protein. It is always found associated with Pfam:PF00226 and Pfam:PF00684. 
DnaJ is a chaperone associated with the Hsp70 heat-shock system involved in protein folding and renaturation after stress. The two C-terminal domains CTDI and CTDII, both incorporated in this family are necessary for maintaining the J-domains in their specific relative positions [PMID:22011374]. Structural analysis of PDB:1nlt shows that PF00684 is nested within this DnaJ C-terminal region [PMID:14656432].", "146": "PF06686\nStage III sporulation protein AC/AD protein family\nThis family consists of several bacterial stage III sporulation protein AC (SpoIIIAC) and SpoIIIAD sequences. The exact function of this family is unknown. SpoIIIAD is an uncharacterised protein which is part of the spoIIIA operon that acts at sporulation stage III as part of a cascade of events leading to endospore formation. The operon is regulated by sigmaG [PMID:8969508].", "147": "PF01035\n6-O-methylguanine DNA methyltransferase, DNA binding domain\nThis is the C-terminal DNA-binding domain of 6-O-methylguanine-DNA methyltransferases [PMID:8156986]. ", "148": "PF02618\nYceG-like family\nThis family of proteins is found in bacteria. Proteins in this family are typically between 332 and 389 amino acids in length. This family was previously incorrectly annotated and named as aminodeoxychorismate lyase. The structure of Swiss:P28306 was solved by X-ray crystallography.", "149": "PF13517\nFG-GAP-like repeat\nThis entry represents a repeat found in alpha integrins and related proteins, which form a 7-fold repeat that adopts a beta-propeller fold. This repeat contains a putative calcium-binding site. 
These repeats are found in multiple proteins from eukaryotes and bacteria and mediate diverse biological processes at both molecular and cellular levels, such as cell-cell interactions, host-pathogen recognition or innate immune responses.", "150": "PF13537\nGlutamine amidotransferase domain\nThis domain is a class-II glutamine amidotransferase domain found in a variety of enzymes such as asparagine synthetase and glutamine-fructose-6-phosphate transaminase.", "151": "PF16859\nTetracyclin repressor-like, C-terminal domain\nThis family of bacterial transcriptional repressors is characterised by the short approximately 50 amino acid stretch of residues constituting the helix-turn-helix DNA binding motif, around the YRFhY motif. The target proteins that are repressed are involved in the transcriptional control of multi-drug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity [PMID:15944459].", "152": "PF00327\nRibosomal protein L30p/L7e\nThis family includes prokaryotic L30 and eukaryotic L7.", "153": "PF16199\nRadical_SAM C-terminal domain\nThis domain is found as a C-terminal extension to a subset of Radical_SAM domains. It is found in archaeal, bacterial, fungal, plant and human proteins.", "154": "PF00586\nAIR synthase related protein, N-terminal domain\nThis family includes Hydrogen expression/formation protein HypE Swiss:P24193, AIR synthases Swiss:P08178 EC:6.3.3.1, FGAM synthase Swiss:P35852 EC:6.3.5.3 and selenide, water dikinase Swiss:P16456 EC:2.7.9.3. The N-terminal domain of AIR synthase forms the dimer interface of the protein, and is suggested as a putative ATP binding domain [PMID:10508786].", "155": "PF04043\nPlant invertase/pectin methylesterase inhibitor\nThis domain inhibits pectin methylesterases (PMEs) and invertases through formation of a non-covalent 1:1 complex [PMID:8521860]. 
It has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension (see [PMID:10880981]). It may also be involved in inhibiting microbial pathogen PMEs. It has been observed that it is often expressed as a large inactive preprotein [PMID:8521860]. It is also found at the N-termini of PMEs predicted from DNA sequences (personal obs:C Yeats), suggesting that both PMEs and their inhibitor are expressed as a single polyprotein and subsequently processed. It has two disulphide bridges and is mainly alpha-helical [PMID:10880981].", "156": "PF02260\nFATC domain\nThe FATC domain is named after FRAP, ATM, TRRAP C-terminal [PMID:10782091]. The solution structure of the FATC domain suggests it plays a role in redox-dependent structural and cellular stability [PMID:15772072].", "157": "PF01957\nNfeD-like C-terminal, partner-binding\nNfeD-like proteins are widely distributed throughout prokaryotes and are frequently associated with genes encoding stomatin-like proteins (slipins). There appear to be three major groups: an ancestral group with only an N-terminal serine protease domain and this C-terminal beta sheet-rich domain which is structurally very similar to the OB-fold domain, associated with its neighbouring slipin cluster; a second major group with an additional middle, membrane-spanning domain, associated in some species with eoslipin and in others with yqfA; a final 'artificial' group which unites truncated forms lacking the protease region and associated with their ancestral gene partner, either yqfA or eoslipin. This NefD, C-terminal, domain appears to be the major one for relating to the associated protein. 
NfeD homologues are clearly reliant on their conserved gene neighbour which is assumed to be necessary for function, either through direct physical interaction or by functioning in the same pathway, possibly involved with lipid-rafts [PMID:20012272].", "158": "PF00288\nGHMP kinases N terminal domain\nThis family includes homoserine kinases, galactokinases and mevalonate kinases.", "159": "PF08542\nReplication factor C C-terminal domain\nThis is the C-terminal domain of RFC (replication factor-C) protein of the clamp loader complex which binds to the DNA sliding clamp (proliferating cell nuclear antigen, PCNA). The five modules of RFC assemble into a right-handed spiral, which results in only three of the five RFC subunits (RFC-A, RFC-B and RFC-C) making contact with PCNA, leaving a wedge-shaped gap between RFC-E and the PCNA clamp-loader complex. The C-terminal is vital for the correct orientation of RFC-E with respect to RFC-A [PMID:15201901].", "160": "PF02457\nDisA bacterial checkpoint controller nucleotide-binding\nThe DisA protein is a bacterial checkpoint protein that dimerises into an octameric complex. The protein consists of three distinct domains. This domain is the first and is a globular, nucleotide-binding region; the next 146-289 residues constitute the DisA-linker family, Pfam:PF10635, that consists of an elongated bundle of three alpha helices (alpha-6, alpha-10, and alpha-11), one side of which carries an additional three helices (alpha7-9), which thus forms a spine-like linker between domains 1 and 3. The C-terminal residues, of domain 3, are represented by family HHH, Pfam:PF00633, the specific DNA-binding domain. The octameric complex thus has structurally linked nucleotide-binding and DNA-binding HhH domains and the nucleotide-binding domains are bound to a cyclic di-adenosine phosphate such that DisA is a specific di-adenylate cyclase. 
This N-terminal domain has been identified as a diadenylate cyclase (DAC) responsible for producing c-di-AMP from two molecules of ATP [PMID:32095817]. The di-adenylate cyclase activity is strongly suppressed by binding to branched DNA, but not to duplex or single-stranded DNA, suggesting a role for DisA as a monitor of the presence of stalled replication forks or recombination intermediates via DNA structure-modulated c-di-AMP synthesis [PMID:18439896].", "161": "PF07332\nPutative Actinobacterial Holin-X, holin superfamily III\nPhage_holin_3_6 is a family of small hydrophobic proteins with two or three transmembrane domains of the Hol-X family. Holin proteins are produced by double-stranded DNA bacteriophages that use an endolysin-holin strategy to achieve lysis of their hosts. The endolysins are peptidoglycan-degrading enzymes that are usually accumulated in the cytosol until access to the cell wall substrate is provided by the holin membrane lesion.", "162": "PF13408\nRecombinase zinc beta ribbon domain\nThis short bacterial protein contains a zinc ribbon domain that is likely to be DNA-binding. This domain is found in site specific recombinase proteins. This family appears most closely related to Pfam:PF04606.", "163": "PF01970\nTripartite tricarboxylate transporter TctA family\nThis family, formerly known as DUF112, is a family of bacterial and archaeal tripartite tricarboxylate transporters of the extracytoplasmic solute binding receptor-dependent transporter group of families, distinct from the ABC and TRAP-T families [PMID:14499931]. TctA is part of the tripartite TctABC system which, as characterised in S. typhimurium [PMID:6141166], is a secondary carrier that depends for activity on the extracytoplasmic tricarboxylate-binding receptor TctC as well as two integral membrane proteins, TctA and TctB. Complete three-component systems are found only in bacteria. 
TctA is a large transmembrane protein with up to 12 predicted membrane spanning regions in bacteria and up to 11 such in archaea, with the N-terminal within the cytoplasm. TctA is thought to be a permease, and in most other bacteria functions without TctB and TctC molecules [PMID:14499931].", "164": "PF07538\nClostridial hydrophobic W\nA novel extracellular macromolecular system has been proposed based on the proteins containing ChW repeats [PMID:11466286]. ChW stands for Clostridial hydrophobic with conserved W (tryptophan). This repeat was originally described in Clostridium acetobutylicum but is also found in other Gram-positive bacteria including Enterococcus faecalis, Streptococcus agalactiae and Streptomyces coelicolor.", "165": "PF04973\nNicotinamide mononucleotide transporter\nMembers of this family are integral membrane proteins that are involved in transport of nicotinamide mononucleotide [PMID:2198247, PMID:2546921].", "166": "PF04003\nDip2/Utp12 Family\nThis domain is found at the C-terminus of proteins containing WD40 repeats. These proteins are part of the U3 ribonucleoprotein the yeast protein is called Utp12 or DIP2 Swiss:Q12220 [PMID:12068309].", "167": "PF01642\nMethylmalonyl-CoA mutase\nThe enzyme methylmalonyl-CoA mutase is a member of a class of enzymes that uses coenzyme B12 (adenosylcobalamin) as a cofactor. The enzyme induces the formation of an adenosyl radical from the cofactor. This radical then initiates a free-radical rearrangement of its substrate, succinyl-CoA, to methylmalonyl-CoA [PMID:8805541].", "168": "PF05191\nAdenylate kinase, active site lid\nComparisons of adenylate kinases have revealed a particular divergence in the active site lid. In some organisms, particularly the Gram-positive bacteria, residues in the lid domain have been mutated to cysteines and these cysteine residues are responsible for the binding of a zinc ion. The bound zinc ion in the lid domain, is clearly structurally homologous to Zinc-finger domains. 
However, it is unclear whether the adenylate kinase lid is a novel zinc-finger DNA/RNA binding domain, or that the lid bound zinc serves a purely structural function [PMID:9715904].", "169": "PF07703\nAlpha-2-macroglobulin bait region domain\nAlpha-2-macroglobulins (A2Ms) are plasma proteins that trap and inhibit a broad range of proteases and are major components of the eukaryotic innate immune system. However, A2M-like proteins were identified in pathogenically invasive bacteria and species that colonize higher eukaryotes. This domain is found in eukaryotic and bacterial proteins. In human A2Ms, this domain encompasses macroglobulin-like domain MG5 and 6 including bait region. In Salmonella enterica ser A2Ms, this domain encompasses MG7 and MG8 including the bait region [PMID:25221932] [PMID:22290936]. The Bait region is cleaved by proteases, followed by a large conformational change that blocks the target protease within a cage-like complex. This model of protease entrapment is recognised as the Venus flytrap mechanism [PMID:25221932].", "170": "PF13288\nDXP reductoisomerase C-terminal domain\nThis is the C-terminal domain of the 1-deoxy-D-xylulose-5-phosphate reductoisomerase enzyme. This domain forms a left handed super-helix.", "171": "PF13561\nEnoyl-(Acyl carrier protein) reductase\nThis domain is found in Enoyl-(Acyl carrier protein) reductases.", "172": "Ribosomal Proteins L2, C-terminal domain", "173": "PF06026\nRibose 5-phosphate isomerase A (phosphoriboisomerase A)\nThis family consists of several ribose 5-phosphate isomerase A or phosphoriboisomerase A (EC:5.3.1.6) from bacteria, eukaryotes and archaea. ", "174": "PF05552\nConserved TM helix\nThis alignment represents a conserved transmembrane helix as well as some flanking sequence. 
It is often found in association with Pfam:PF00924.", "175": "PF03443\nAuxiliary Activity family 9 (formerly GH61)\nAlthough weak endoglucanase activity has been demonstrated in several members of this family [1-3], they lack the clustered conserved catalytic acidic amino acids present in most glycoside hydrolases. Many members of this family lack measurable cellulase activity on their own, but enhance the activity of other cellulolytic enzymes. They are therefore unlikely to be true glycoside hydrolases [PMID:20230050]. The substrate-binding surface of this family is a flat Ig-like fold [PMID:24912171]. This family of enzymes were originally classified as glycoside hydrolases (GH61) and they have been reclassified as the Auxiliary Activity Family 9 (AA9) of CAZy.", "176": "PF02601\nExonuclease VII, large subunit\nThis family consists of exonuclease VII, large subunit EC:3.1.11.6. This enzyme catalyses exonucleolytic cleavage in either 5'->3' or 3'->5' direction to yield 5'-phosphomononucleotides. This exonuclease VII enzyme is composed of one large subunit and 4 small ones [PMID:6284744].", "177": "PF07486\nCell Wall Hydrolase\nThese enzymes have been implicated in cell wall hydrolysis, most extensively in Bacillus subtilis. For instance Swiss:P50739 is expressed during sporulation as an inactive form and then deposited on the cell outer cortex. During germination the enzyme is activated and hydrolyses the cortex ([PMID:10658652]). A similar role is carried out by the partially redundant Swiss:P42249 ([PMID:9515903]). It is not clear whether these enzymes are amidases or peptidases.", "178": "PF01813\nATP synthase subunit D\nThis is a family of subunit D from various ATP synthases including V-type H+ transporting and Na+ dependent. 
Subunit D is suggested to be an integral part of the catalytic sector of the V-ATPase [PMID:7831318].", "179": "PF02491\nSHS2 domain inserted in FTSA\nFtsA is essential for bacterial cell division, and co-localises to the septal ring with FtsZ. The SHS2 domain is inserted in to the RNAseH fold of FtsA [PMID:15281131], and is involved in protein-protein interaction [PMID:9352931].", "180": "PF04377\nArginine-tRNA-protein transferase, C terminus\nThis family represents the C terminal region of the enzyme arginine-tRNA-protein transferase (EC 2.3.2.8), which catalyses the post-translational conjugation of arginine to the N terminus of a protein. In eukaryotes, this functions as part of the N-end rule pathway of protein degradation by conjugating a destabilising amino acid to the amino terminal aspartate or glutamate of a protein, targeting the protein for ubiquitin-dependent proteolysis. N terminal cysteine is sometimes modified [PMID:9858543].", "181": "PF09362\nDomain of unknown function (DUF1996)\nThis family of proteins are functionally uncharacterised.", "182": "Fungal cellulose binding domain", "183": "Domain of unknown function (DUF4116)", "184": "PF12874\nZinc-finger of C2H2 type\nThis is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding.", "185": "PF17764\n3'DNA-binding domain (3'BD)\nThis domain represents the N-terminal DNA-binding domain found in the PriA protein. The 3'BD, which has been shown to bind the 3' end of the leading-strand arm of replication fork structures.", "186": "Ribosomal protein S5, C-terminal domain", "187": "PF00885\n6,7-dimethyl-8-ribityllumazine synthase\nThis family includes the beta chain of 6,7-dimethyl-8- ribityllumazine synthase EC:2.5.1.9, an enzyme involved in riboflavin biosynthesis. 
The family also includes a subfamily of distant archaebacterial proteins that may also have the same function, for example Swiss:O28856. The family contains a number of different subsets including a family of proteins comprising archaeal lumazine and riboflavin synthases, type I lumazine synthases, and the eubacterial type II lumazine synthases [PMID:16923880]. It has been established that lumazine synthase catalyses the penultimate step in the biosynthesis of riboflavin in plants and microorganisms. The type I lumazine synthases are active in pentameric or icosahedral quaternary assemblies, whereas the type II are decameric [PMID:17854827]. Brucella, a bacterial genus that causes brucellosis, and other Rhizobiales have an atypical riboflavin metabolic pathway. Brucella spp code for both a type-I and a type-II lumazine synthase, and it has been shown that at least one of these two has to be present in order for Brucella to be viable, showing that in the case of Brucella flavin metabolism is implicated in bacterial virulence [PMID:20195542].", "188": "PF05257\nCHAP domain\nThis domain corresponds to an amidase function. Many of these proteins are involved in cell wall metabolism of bacteria. This domain is found at the N-terminus of Swiss:P43675, where it functions as a glutathionylspermidine amidase EC:3.5.1.78 [PMID:7775463]. This domain is found to be the catalytic domain of PlyCA [PMID:16818874]. CHAP is the amidase domain of bifunctional Escherichia coli glutathionylspermidine synthetase/amidase, and it catalyses the hydrolysis of Gsp (glutathionylspermidine) into glutathione and spermidine [PMID:21226054].", "189": "PF03315\nSerine dehydratase beta chain\nL-serine dehydratase (EC:4.2.1.13) is found as a heterodimer of alpha and beta chain or as a fusion of the two chains in a single protein. This enzyme catalyses the deamination of serine to form pyruvate. 
This enzyme is part of the gluconeogenesis pathway.", "190": "Chorismate synthase", "191": "PF01255\nPutative undecaprenyl diphosphate synthase\nPreviously known as uncharacterized protein family UPF0015, a single member of this family Swiss:O82827 has been identified as an undecaprenyl diphosphate synthase [PMID:9677368].", "192": "PF00013\nKH domain\nKH motifs bind RNA in vitro. Autoantibodies to Nova, a KH domain protein, cause paraneoplastic opsoclonus ataxia.", "193": "PF00241\nCofilin/tropomyosin-type actin-binding protein\nSevers actin filaments and binds to actin monomers.", "194": "PF04117\nMpv17 / PMP22 family\nThe 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information [PMID:11590176]. Mpv17 is a closely related peroxisomal protein. In mouse, the Mpv17 protein is involved in the development of early-onset glomerulosclerosis [PMID:11327696]. More recently a homolog of Mpv17 in S. cerevisiae has been found to be an integral membrane protein of the inner mitochondrial membrane where it has been proposed to have a role in ethanol metabolism and tolerance during heat-shock [PMID:15189984]. Defects in MPV17 are associated with mitochondrial DNA depletion syndrome (MDDS) and Navajo neurohepatopathy (NNH) [PMID:16582910][PMID:16909392]. MDDS is a clinically heterogeneous group of disorders characterised by a reduction in mitochondrial DNA (mtDNA) copy number. Primary mtDNA depletion is inherited as an autosomal recessive trait and may affect single organs, typically muscle or liver, or multiple tissues. 
Individuals with the hepatocerebral form of mitochondrial DNA depletion syndrome have early progressive liver failure and neurologic abnormalities, hypoglycemia, and increased lactate in body fluids. NNH is an autosomal recessive disease that is prevalent among Navajo children in the South Western states of America. The major clinical features are hepatopathy, peripheral neuropathy, corneal anesthesia and scarring, acral mutilation, cerebral leukoencephalopathy, failure to thrive, and recurrent metabolic acidosis with intercurrent infections. Infantile, childhood, and classic forms of NNH have been described. Mitochondrial DNA depletion was detected in the livers of patients, suggesting a primary defect in mtDNA maintenance [PMID:16909392].", "195": "PF02391\nMoaE protein\nThis family contains the MoaE protein that is involved in biosynthesis of molybdopterin [PMID:8514782]. Molybdopterin, the universal component of the pterin molybdenum cofactors, contains a dithiolene group serving to bind Mo. Addition of the dithiolene sulfurs to a molybdopterin precursor requires the activity of the converting factor. Converting factor contains the MoaE and MoaD proteins.", "196": "PF01139\ntRNA-splicing ligase RtcB\nThis family of RNA ligases (EC:6.5.1.3) join 2',3'-cyclic phosphate and 5'-OH ends. They catalyse the splicing of tRNA and may also participate in tRNA repair and recovery from stress-induced RNA damage [1-3].", "197": "PF02673\nBacitracin resistance protein BacA\nBacitracin resistance protein (BacA) is a putative undecaprenol kinase. BacA confers resistance to bacitracin, probably by phosphorylation of undecaprenol [PMID:8389741]. More recent studies show that BacA has undecaprenyl pyrophosphate phosphatase activity. Undecaprenyl phosphate is a key lipid intermediate involved in the synthesis of various bacterial cell wall polymers. Bacitracin, a mixture of related cyclic polypeptide antibiotics, is used to treat surface tissue infections. 
Its primary mode of action is the inhibition of bacterial cell wall synthesis through sequestration of the essential carrier lipid undecaprenyl pyrophosphate, C55-PP, resulting in the loss of cell integrity and lysis [PMID:15138271, PMID:15778224]. The characteristic phosphatase sequence-motif in this family is likely to be the PGxSRSGG, compared with the PSGH of the PAP family of phosphatases [PMID:15778224].", "198": "PF02445\nQuinolinate synthetase A protein\nQuinolinate synthetase catalyses the second step of the de novo biosynthetic pathway of pyridine nucleotide formation. In particular, quinolinate synthetase is involved in the condensation of dihydroxyacetone phosphate and iminoaspartate to form quinolinic acid [PMID:10648170]. This synthesis requires two enzymes, a FAD-containing \"B protein\" and an \"A protein\".", "199": "PF13445\nRING-type zinc-finger\nThis zinc-finger is a typical RING-type of plant ubiquitin ligases [PMID:15644464].", "200": "PF13472\nGDSL-like Lipase/Acylhydrolase family\nThis family of presumed lipases and related enzymes are similar to Pfam:PF00657.", "201": "PF13850\nEndoplasmic Reticulum-Golgi Intermediate Compartment (ERGIC)\nThis family is the N-terminal of ERGIC proteins [PMID:15308636], ER-Golgi intermediate compartment clusters, otherwise known as Ervs, and is associated with family COPIIcoated_ERV, Pfam:PF07970.", "202": "PF04024\nPspC domain\nThis family includes Phage shock protein C (PspC) that is thought to be a transcriptional regulator. The presumed domain is 60 amino acid residues in length.", "203": "PF04390\nLipopolysaccharide-assembly\nLptE (formerly known as RplB) is involved in lipopolysaccharide-assembly on the outer membrane of Gram-negative organisms. The lipopolysaccharide component of the outer bacterial membrane is transported from its source of origin to the outer membrane by a set of proteins constituting a transport machinery that is made up of LptA, LptB, LptC, LptD, LptE. 
LptD appears to be anchored in the outer membrane, and LptE forms a complex with it. This part of the machinery complex is involved in the assembly of lipopolysaccharide in the outer leaflet of the outer membrane [PMID:18424520].", "204": "PF16653\nSaccharopine dehydrogenase C-terminal domain\nThis family comprises the C-terminal domain of saccharopine dehydrogenase. In some organisms this enzyme is found as a bifunctional polypeptide with lysine ketoglutarate reductase. The saccharopine dehydrogenase can also function as a saccharopine reductase.", "205": "Ribosomal protein L34", "206": "PF05161\nMOFRL family\nMOFRL(multi-organism fragment with rich Leucine) family exists in bacteria and eukaryotes. The function of this domain is not clear, although it exists in some putative enzymes such as reductases and kinases.", "207": "PF01149\nFormamidopyrimidine-DNA glycosylase N-terminal domain\nFormamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidised purines from damaged DNA. This family is the N-terminal domain contains eight beta-strands, forming a beta-sandwich with two alpha-helices parallel to its edges [PMID:11912217].", "208": "PF11799\nimpB/mucB/samB family C-terminal domain\nThese proteins are involved in UV protection (Swiss).", "209": "PF12344\nUltra-violet resistance protein B\nThis domain family is found in bacteria, archaea and eukaryotes, and is approximately 40 amino acids in length. The family is found in association with Pfam:PF00271, Pfam:PF02151, Pfam:PF04851. There are two conserved sequence motifs: YAD and RRR. This family is the C terminal region of the UvrB protein which conveys mutational resistance against UV light to various different species.", "210": "PF01302\nCAP-Gly domain\nCytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. 
A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein Swiss:Q20728 CAP-Gly domain was recently solved [PMID:12221106]. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove [PMID:12221106].", "211": "PF04810\nSec23/Sec24 zinc finger\nCOPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is found to be zinc binding domain.", "212": "PF09723\nZinc ribbon domain\nThis entry represents a region of about 41 amino acids found in a number of small proteins in a wide range of bacteria. The region usually begins with the initiator Met and contains two CxxC motifs separated by 17 amino acids. One protein in this entry has been noted as a putative regulatory protein, designated FmdB. Most proteins in this entry have a C-terminal region containing highly degenerate sequence.", "213": "PF16016\nVAD1 Analog of StAR-related lipid transfer domain\nThe VASt (VAD1 Analog of StAR-related lipid transfer) domain is conserved across eukaryotes and is structurally related to Bet v1-like domains, including START lipid-binding domains. The 190-amino acid VASt domain is predominantly associated with lipid binding domains such as GRAM Pfam:PF02893, C2 and PH domains. The VASt domain is likely to have a function in binding large hydrophobic ligands and may be specific for sterol [PMID:24965341, PMID:26001273]. 
The predicted structure of the VASt domain is a two-layer sandwich alpha beta fold, also called 'helix grip fold', containing three alpha helices (alpha1 to 3), six beta-sheets (beta1 to 6) and two loops (omega1 and 2) numbered from N to C terminus [PMID:24965341]. Some proteins known to contain a VASt domain are : Plant vascular associated death1 (VAD1), a regulator of programmed cell death (PCD) harboring a GRAM putative lipid-binding domain. Yeast SNF1 Interacting Protein 3 (SIP3), may be involved in sterol transfer between intracellular membranes. Yeast Suicide Protein 1 (YSP1), a mitochondrial protein specifically required for the mitochondrial thread-grain transition, de-energization, and the cell death. May be involved in sterol transfer between intracellular membranes. Yeast Suicide Protein 2 (YSP2), a mitochondrial membrane protein involved in mitochondrial fragmentation and may be involved in sterol transfer between intracellular membranes.It is also found in human GramD1a-c.", "214": "PF16901\nC-terminal domain of alpha-glycerophosphate oxidase\nDAO_C is the C-terminal region of alpha-glycerophosphate oxidase.", "215": "PTS system sorbose subfamily IIB component", "216": "PF00037\n4Fe-4S binding domain\nSuperfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich.", "217": "FHIPEP family", "218": "PF08766\nDEK C terminal domain\nDEK is a chromatin associated protein that is linked with cancers and autoimmune disease. This domain is found at the C terminal of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients [PMID:7504406]. The structure of this domain shows it to be homologous to the E2F/DP transcription factor family [PMID:15238633]. 
This domain is also found in chitin synthase proteins like Swiss:Q8TF96, and in protein phosphastases such as Swiss:Q6NN85.", "219": "PF12775\nP-loop containing dynein motor region\nThis domain is found in human cytoplasmic dynein-2 proteins. Cytoplasmic dynein-2 (dynein-2) performs intraflagellar transport and is associated with human skeletal ciliopathies. Dyneins share a conserved motor domain that couples cycles of ATP hydrolysis with conformational changes to produce movement. Structural analysis reveal that the motor's ring consists of six AAA+ domains (ATPases associated with various cellular activities (AAA1-AAA6). This is the third nucleotide binding sites in the dynein motor. However, AAA3 has lost the catalytic residues necessary for ATP hydrolysis (the Walker B glutamate, the arginine finger, sensor-I and sensor-II motifs) [PMID:25470043].", "220": "PF00902\nSec-independent protein translocase protein (TatC)\nThe bacterial Tat system has a remarkable ability to transport folded proteins even enzyme complexes across the cytoplasmic membrane. It is structurally and mechanistically similar to the Delta pH-driven thylakoidal protein import pathway. A functional Tat system or Delta pH-dependent pathway requires three integral membrane proteins: TatA/Tha4, TatB/Hcf106 and TatC/cpTatC. The TatC protein is essential for the function of both pathways. It might be involved in twin-arginine signal peptide recognition, protein translocation and proton translocation. Sequence analysis predicts that TatC contains six transmembrane helices (TMHs), and experimental data confirmed that N- and C-termini of TatC or cpTatC are exposed to the cytoplasmic or stromal face of the membrane. The cytoplasmic N-terminus and the first cytoplasmic loop region of the Escherichia coli TatC protein are essential for protein export. 
At least two TatC molecules co-exist within each Tat translocon [PMID:12163163].", "221": "PF03479\nPlants and Prokaryotes Conserved (PCC) domain\nThis is a Plants and Prokaryotes Conserved (PPC) domain found in proteins that contain AT-hook motifs Pfam:PF02178 which strongly suggests a DNA-binding function for the proteins as a whole. Proteins with PPC domains are found in Bacteria, Archaea and the plant kingdom [PMID:15604740, PMID:17295322].The PPC domain has a single alpha-helix packed against an antiparallel beta-sheet, which is formed by five beta-strands. There are three highly conserved histidine residues, eg at 117, 119 and 133 in Swiss:Q46QL5 which appear to form a zinc-binding site, and the domain has been observed to form homotrimers. The domain co-occurs with a thioredoxin-like domain in uncharacterized cyanobacterial proteins [PMID:17295322].", "222": "PF14278\nTranscriptional regulator C-terminal region\nThis domain is a tetracycline repressor, domain 2, or C-terminus.", "223": "PF10150\nRibonuclease E/G family\nRibonuclease E and Ribonuclease G are related enzymes that cleave a wide variety of RNAs [PMID:16237448].", "224": "PF01702\nQueuine tRNA-ribosyltransferase\nThis is a family of queuine tRNA-ribosyltransferases EC:2.4.2.29, also known as tRNA-guanine transglycosylase and guanine insertion enzyme. Queuine tRNA-ribosyltransferase modifies tRNAs for asparagine, aspartic acid, histidine and tyrosine with queuine. It catalyses the exchange of guanine-34 at the wobble position with 7-aminomethyl-7-deazaguanine, and the addition of a cyclopentenediol moiety to 7-aminomethyl-7-deazaguanine-34 tRNA; giving a hypermodified base queuine in the wobble position [PMID:8654383, PMID:8323579]. 
The aligned region contains a zinc binding motif C-x-C-x2-C-x29-H, and important tRNA and 7-aminomethyl-7deazaguanine binding residues [PMID:8654383].", "225": "Ribosomal protein L14p/L23e", "226": "PF02322\nCytochrome bd terminal oxidase subunit II\nThis family consists of cytochrome bd type terminal oxidases that catalyse quinol-dependent, Na+-independent oxygen uptake [PMID:8626304]. Members of this family are integral membrane proteins and contain a protohaem IX centre B558. One member of the family Swiss:O05192 is implicated in having an important role in micro-aerobic nitrogen fixation in the enteric bacterium Klebsiella pneumoniae [PMID:9274021]. The family forms an integral functional unit with subunit I, family Bac_Ubq_Cox, Pfam:PF01654.", "227": "PF03740\nPyridoxal phosphate biosynthesis protein PdxJ\nMembers of this family belong to the PdxJ family that catalyses the condensation of 1-deoxy-d-xylulose-5-phosphate (DXP) and 1-amino-3-oxo-4-(phosphohydroxy)propan-2-one to form pyridoxine 5'-phosphate (PNP). This reaction is involved in de novo synthesis of pyridoxine (vitamin B6) and pyridoxal phosphate [PMID:10225425].", "228": "PF16488\nArgonaute linker 2 domain\nArgoL2 is the second linker domain in eukaryotic argonaute proteins. It starts with two alpha-helices aligned orthogonally to each other followed by a beta-strand involved in linking the two lobes, the PAZ lobe and the Piwi lobe of argonaute to each other. Linker 2 together with the N, PAZ and L1 domains form a compact global fold [PMID:16061186]. Numerous residues from Piwi, L1 and L2 linkers direct the path of the phosphate backbone of nucleotides 7-9, thus allowing DNA-slicing [PMID:23746446].", "229": "PF09397\nFtsk gamma domain\nThis domain directs oriented DNA translocation and forms a winged helix structure [PMID:17057717]. 
Mutated proteins with substitutions in the FtsK gamma DNA-recognition helix are impaired in DNA binding [PMID:17057717].", "230": "Dynein light chain type 1", "231": "PF08310\nLGFP repeat\nThis 54 amino acid repeat is found in many hypothetical proteins. Several hypothetical proteins from C.glutamicum and C.efficiens along with PS1 protein contain this repeat region. The N-terminus region of PS1 contains an esterase domain which transfers corynomycolic acid. The C-terminus region consists of 4 tandem LGFP repeats. It is hypothesised that the PS1 proteins in Corynebacterium, when associated with the cell wall, may be anchored via the LGFP tandem repeats that may be important for maintaining cell wall integrity [Adindla et al. Comparative and Functional Genomics 2004; 5:2-16]. Deletion of Swiss:Q01377 protein results in a 10-fold increase in the cell volume of the organism and infers the corresponding proteins involvement in the cell shape formation [PMID:12740729]. The secondary structure of each repeat is predicted to comprise two beta-strands and one alpha-helix [Adindla et al. 2004].", "232": "PF02844\nPhosphoribosylglycinamide synthetase, N domain\nPhosphoribosylglycinamide synthetase catalyses the second step in the de novo biosynthesis of purine. The reaction catalysed by Phosphoribosylglycinamide synthetase is the ATP- dependent addition of 5-phosphoribosylamine to glycine to form 5'phosphoribosylglycinamide. This domain is related to the N-terminal domain of biotin carboxylase/carbamoyl phosphate synthetase (see Pfam:PF00289). 
This domain is structurally related to the PreATP-grasp domain.", "233": "PF02518\nHistidine kinase-, DNA gyrase B-, and HSP90-like ATPase\nThis family represents the structurally related ATPase domains of histidine kinase, DNA gyrase B and HSP90.", "234": "PF03484\ntRNA synthetase B5 domain\nThis domain is found in phenylalanine-tRNA synthetase beta subunits.", "235": "FliP family", "236": "PF00593\nTonB dependent receptor\nThis model now only covers the conserved part of the barrel structure.", "237": "PF00300\nHistidine phosphatase superfamily (branch 1)\nThe histidine phosphatase superfamily is so named because catalysis centres on a conserved His residue that is transiently phosphorylated during the catalytic cycle. Other conserved residues contribute to a 'phosphate pocket' and interact with the phospho group of substrate before, during and after its transfer to the His residue. Structure and sequence analyses show that different families contribute different additional residues to the 'phosphate pocket' and, more surprisingly, differ in the position, in sequence and in three dimensions, of a catalytically essential acidic residue. The superfamily may be divided into two main branches. The larger branch 1 contains a wide variety of catalytic functions, the best known being fructose 2,6-bisphosphatase (found in a bifunctional protein with 2-phosphofructokinase) and cofactor-dependent phosphoglycerate mutase. The latter is an unusual example of a mutase activity in the superfamily: the vast majority of members appear to be phosphatases. 
The bacterial regulatory protein phosphatase SixA is also in branch 1 and has a minimal, and possible ancestral-like structure, lacking the large domain insertions that contribute to binding of small molecules in branch 1 members.", "238": "PF02609\nExonuclease VII small subunit\nThis family consist of exonuclease VII, small subunit EC:3.1.11.6 This enzyme catalyses exonucleolytic cleavage in either 5'->3' or 3'->5' direction to yield 5'-phosphomononucleotides. This exonuclease VII enzyme is composed of one large subunit and 4 small ones [PMID:6284744].", "239": "PF01798\nsnoRNA binding domain, fibrillarin\nThis family consists of various Pre RNA processing ribonucleoproteins. The function of the aligned region is unknown however it may be a common RNA or snoRNA or Nop1p binding domain. Nop5p (Nop58p) Swiss:Q12499 from yeast is the protein component of a ribonucleoprotein required for pre-18s rRNA processing and is suggested to function with Nop1p in a snoRNA complex [PMID:9632712]. Nop56p Swiss:O00567 and Nop5p interact with Nop1p and are required for ribosome biogenesis [PMID:9372940]. Prp31p Swiss:p49704 is required for pre-mRNA splicing in S. cerevisiae [PMID:8604353]. Fibrillarin, or Nop, is the catalytic subunit responsible for the methyl transfer reaction of the site-specific 2'-O-methylation of ribosomal and spliceosomal RNA [PMID:20864039].", "240": "PF00936\nBMC domain\nBacterial microcompartments are primitive organelles composed entirely of protein subunits. The prototypical bacterial microcompartment is the carboxysome, a protein shell for sequestering carbon fixation reactions. These proteins for hexameric structure [PMID:16081736].", "241": "Ribosomal protein L16p/L10e", "242": "PF17676\nLD-carboxypeptidase C-terminal domain\nMuramoyl-tetrapeptide carboxypeptidase hydrolyses a peptide bond between a di-basic amino acid and the C-terminal D-alanine in the tetrapeptide moiety in peptidoglycan. This cleaves the bond between an L- and a D-amino acid. 
The function of this activity is in murein recycling. This family also includes the microcin c7 self-immunity protein Swiss:Q47511. This family corresponds to Merops family S66.", "243": "PF04229\nGrpB protein\nThis family has been suggested to belong to the nucleotidyltransferase superfamily [PMID:19833706]. It occurs at the C-terminus of dephospho-CoA kinase (CoaE) in a number of cases, where it plays a role in the proper folding of the enzyme [PMID:19876400].", "244": "PQQ-like domain", "245": "PF02787\nCarbamoyl-phosphate synthetase large chain, oligomerisation domain\nCarbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. ", "246": "PF01510\nN-acetylmuramoyl-L-alanine amidase\nThis family includes zinc amidases that have N-acetylmuramoyl-L-alanine amidase activity EC:3.5.1.28. This enzyme domain cleaves the amide bond between N-acetylmuramoyl and L-amino acids in bacterial cell walls (preferentially: D-lactyl-L-Ala). The structure is known for the bacteriophage T7 structure and shows that two of the conserved histidines are zinc binding.", "247": "PF07971\nGlycosyl hydrolase family 92\nMembers of this family are alpha-1,2-mannosidases, enzymes which remove alpha-1,2-linked mannose residues from Man(9)(GlcNAc)(2) by hydrolysis. They are critical for the maturation of N-linked oligosaccharides and ER-associated degradation [PMID:10026209].", "248": "PF06172\nCupin superfamily (DUF985)\nFamily of uncharacterised proteins found in bacteria and eukaryotes that belongs to the Cupin superfamily.", "249": "PF02582\nRMND1/Sif2-Sif3/Mrx10, DUF155\nThis entry represents a domain found in RMND1 from mammals, Sif2/Sif3 from fission yeasts and Rmd1/Rmd8/YDR282C (Mrx10) from budding yeasts. 
RMND1 and its yeast homologue, Mrx10, are mitochondrial proteins required for mitochondrial translation [PMID:23022098, PMID:23022099]. Rmd1 and Rmd8 are cytoplasmic proteins required for sporulation [PMID:12586695].", "250": "PF06071\nProtein of unknown function (DUF933)\nThis domain is found at the C terminus of the YchF GTP-binding protein (Swiss:O13998) and is possibly related to the ubiquitin-like and MoaD/ThiS superfamilies. ", "251": "PF03797\nAutotransporter beta-domain\nSecretion of protein products occurs by a number of different pathways in bacteria. One of these pathways known as the type V pathway was first described for the IgA1 protease [PMID:3027577]. The protein component that mediates secretion through the outer membrane is contained within the secreted protein itself, hence the proteins secreted in this way are called autotransporters. This family corresponds to the presumed integral membrane beta-barrel domain that transports the protein. This domain is found at the C terminus of the proteins it occurs in. The N terminus contains the variable passenger domain that is translocated across the membrane. Once the passenger domain is exported it is cleaved auto-catalytically in some proteins, in others a different protease is used and in some cases no cleavage occurs [PMID:9778731].", "252": "PF10135\nRod binding protein\nMembers of this family are involved in the assembly of the prokaryotic flagellar rod.", "253": "PF01722\nBolA-like protein\nThis family consists of the morphoprotein BolA from E. coli and its various homologues. In E. coli overexpression of this protein causes round morphology and may be involved in switching the cell between elongation and septation systems during cell division [PMID:10361282]. The expression of BolA is growth rate regulated and is induced during the transition into the stationary phase [PMID:10361282]. 
BolA is also induced by stress during early stages of growth [PMID:10361282] and may have a general role in stress response. It has also been suggested that BolA can induce the transcription of penicillin binding proteins 6 and 5 [PMID:2684651, PMID:10361282].", "254": "PF12804\nMobA-like NTP transferase domain\nThis family includes the MobA protein (Molybdopterin-guanine dinucleotide biosynthesis protein A). The family also includes a wide range of other NTP transferase domain.", "255": "PF14319\nTransposase zinc-binding domain\nThis domain is likely to be a zinc-binding domain. It is found at the N-terminus of transposases belonging to the IS91 family.", "256": "PF03796\nDnaB-like helicase C terminal domain\nThe hexameric helicase DnaB unwinds the DNA duplex at the Escherichia coli chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerisation of the N-terminal domain has been observed and may occur during the enzymatic cycle. This C-terminal domain contains an ATP-binding site and is therefore probably the site of ATP hydrolysis.", "257": "PF17940\nTetracyclin repressor-like, C-terminal domain\nThis is the C-terminal domain found in putative transcriptional regulator, TetR family proteins.", "258": "PF00338\nRibosomal protein S10p/S20e\nThis family includes small ribosomal subunit S10 from prokaryotes and S20 from eukaryotes.", "259": "PF00673\nribosomal L5P family C-terminus\nThis region is found associated with Pfam:PF00281.", "260": "PF00149\nCalcineurin-like phosphoesterase\nThis family includes a diverse range of phosphoesterases [PMID:9685491], including protein phosphoserine phosphatases, nucleotidases, sphingomyelin phosphodiesterases and 2'-3' cAMP phosphodiesterases as well as nucleases such as bacterial SbcD Swiss:P13457 or yeast MRE11 Swiss:P32829. 
The most conserved regions in this superfamily centre around the metal chelating residues.", "261": "PF00699\nUrease beta subunit\nThis subunit is known as alpha in Heliobacter.", "262": "PF03466\nLysR substrate binding domain\nThe structure of this domain is known and is similar to the periplasmic binding proteins [PMID:9309218]. This domain binds a variety of ligands that caries in size and structure, such as amino acids, sugar phosphates, organic acids, metal cations, flavonoids, C6-ring carboxylic acids, H2O2, HOCl, homocysteine, NADPH, ATP, sulphate, muropeptides, acetate, salicylate, citrate, phenol- and quinolone derivatives, acetylserines, fatty acid CoA, shikimate, chorismate, homocysteine, indole-3-acetic acid, Na(I), c-di-GMP, ppGpp and hydrogen peroxide (Matilla et. al., FEMS Microbiology Reviews, fuab043, 45, 2021, 1. https://doi.org/10.1093/femsre/fuab043).", "263": "PF17803\nBacterial cadherin-like domain\nThis entry contains numerous bacterial cadherin-like domains found in extracelullar proteins.", "264": "PF02277\nPhosphoribosyltransferase\nThis family of proteins represent the nicotinate-nucleotide- dimethylbenzimidazole phosphoribosyltransferase (NN:DBI PRT) enzymes involved in dimethylbenzimidazole synthesis. This function is essential to de novo cobalamin (vitamin B12) production in bacteria. Nicotinate mononucleotide (NaMN):5,6-dimethylbenzimidazole (DMB) phosphoribosyltransferase (CobT) from Salmonella enterica plays a central role in the synthesis of alpha-ribazole-5'-phosphate, an intermediate for the lower ligand of cobalamin [PMID:12101181].", "265": "PF13305\nTetracyclin repressor-like, C-terminal domain\nThis domain is around 80 amino acids in length. It is found to the C-terminus of a DNA-binding helix-turn-helix domain. This domain may be involved transcriptional regulation predicted to belong to the TetR family. 
This domain contains three conserved residues (WHG) near the C-terminus.", "266": "PF04413\n3-Deoxy-D-manno-octulosonic-acid transferase (kdotransferase)\nMembers of this family transfer activated sugars to a variety of substrates, including glycogen, fructose-6-phosphate and lipopolysaccharides. Members of the family transfer UDP, ADP, GDP or CMP linked sugars. The Glycos_transf_N region is flanked at the N-terminus by a signal peptide and at the C-terminus by Glycos_transf_1 (Pfam:PF00534). The eukaryotic glycogen synthases may be distant members of this bacterial family [PMID:10952982].", "267": "Transforming growth factor beta like domain", "268": "PF02367\nThreonylcarbamoyl adenosine biosynthesis protein TsaE\nThis family of proteins is involved in the synthesis of threonylcarbamoyl adenosine (t(6)A) [1-2].", "269": "Prolipoprotein diacylglyceryl transferase", "270": "PF04515\nPlasma-membrane choline transporter\nThis family represents a high-affinity plasma-membrane choline transporter in C.elegans which is thought to be rate-limiting for ACh synthesis in cholinergic nerve terminals [PMID:15002745].", "271": "PF13634\nNucleoporin FG repeat region\nThis family includes a number of FG repeats that are found in nucleoporin proteins. This family includes the yeast nucleoporins Nup116, Nup100, Nup49, Nup57 and Nup145.", "272": "Putative transposase of IS4/5 family (DUF4096)", "273": "PF02571\nPrecorrin-6x reductase CbiJ/CobK\nThis family consists of Precorrin-6x reductase EC:1.3.1.54. This enzyme catalyses the reaction: precorrin-6Y + NADP(+) <=> precorrin-6X + NADPH. CbiJ and CobK both catalyse the reduction of the macrocycle in the cobalamin biosynthesis pathway [PMID:9742225, PMID:8501034].", "274": "PF05697\nBacterial trigger factor protein (TF)\nIn the E. coli cytosol, a fraction of the newly synthesised proteins requires the activity of molecular chaperones for folding to the native state. 
The major chaperones implicated in this folding process are the ribosome-associated Trigger Factor (TF), and the DnaK and GroEL chaperones with their respective co-chaperones. Trigger Factor is an ATP-independent chaperone and displays chaperone and peptidyl-prolyl-cis-trans-isomerase (PPIase) activities in vitro. It is composed of at least three domains, an N-terminal domain which mediates association with the large ribosomal subunit, a central substrate binding and PPIase domain with homology to FKBP proteins, and a C-terminal domain of unknown function. The positioning of TF at the peptide exit channel, together with its ability to interact with nascent chains as short as 57 residues renders TF a prime candidate for being the first chaperone that binds to the nascent polypeptide chains [PMID:12603737]. This family represents the N-terminal region of the protein.", "275": "Alanine racemase, C-terminal domain", "276": "PF02570\nPrecorrin-8X methylmutase\nThis is a family Precorrin-8X methylmutases also known as Precorrin isomerase, CbiC/CobH, EC:5.4.1.2. This enzyme catalyses the reaction: Precorrin-8X <=> hydrogenobyrinate. This enzyme is part of the Cobalamin (vitamin B12) biosynthetic pathway and catalyses a methyl rearrangement [PMID:9742225, PMID:8501034].", "277": "PF07811\nTadE-like protein\nThe members of this family are similar to a region of the protein product of the bacterial tadE locus (Swiss:Q9S4A6). In various bacterial species, the tad locus is closely linked to flp-like genes, which encode proteins required for the production of pili involved in adherence to surfaces [PMID:11553455]. It is thought that the tad loci encode proteins that act to assemble or export an Flp pilus in various bacteria [PMID:11553455]. 
All tad loci but TadA have putative transmembrane regions [PMID:11553455], and in fact the region in question in this family has a high proportion of hydrophobic amino acid residues.", "278": "Dehydroquinase class II", "279": "Ribosomal protein S5, N-terminal domain", "280": "PF00189\nRibosomal protein S3, C-terminal domain\nThis family contains a central domain Pfam:PF00013, hence the amino and carboxyl terminal domains are stored separately. This is a minimal carboxyl-terminal domain. Some are much longer.", "281": "PF16326\nABC transporter C-terminal domain\nThis domain is found at the C-terminus of ABC transporters. It has a coiled coil structure with an atypical 3(10)-helix in the alpha-hairpin region. It is involved in DNA binding [PMID:22995754].", "282": "PF05485\nTHAP domain\nThe THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes [PMID:12575992].", "283": "PF06961\nProtein of unknown function (DUF1294)\nThis family includes a number of hypothetical bacterial and archaeal proteins of unknown function.", "284": "PF07927\nHicA toxin of bacterial toxin-antitoxin,\nHicA_toxin is a bacterial family of toxins that act as mRNA interferases. 
The antitoxin that neutralises this is family HicB, Pfam:PF15919 [PMID:16895922, PMID:19060138, PMID:21927020].", "285": "PF02482\nSigma 54 modulation protein / S30EA ribosomal protein\nThis Pfam family contains the sigma-54 modulation protein family and the S30AE family of ribosomal proteins which includes the light- repressed protein (lrtA) (Swiss:P47908) [PMID:8063707].", "286": "PF14510\nABC-transporter N-terminal\nThis domain is found at the N-terminus of ABC-transporter proteins from fungi, plants to higher eukaryotes. It is predicted to be an intracellular domain [PMID:8294477, PMID:14711635, PMID:21034832].", "287": "PF02511\nThymidylate synthase complementing protein\nThymidylate synthase complementing protein (Thy1) complements the thymidine growth requirement of the organisms in which it is found, but shows no homology to thymidylate synthase. The bacterial members of this family at least are flavin-dependent thymidylate synthases [PMID:12029065, PMID:15046578, PMID:17890305].", "288": "Acyl CoA binding protein", "289": "PF01239\nProtein prenyltransferase alpha subunit repeat\nBoth farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognise a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognises a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family [PMID:12702202].", "290": "5'-3' exonuclease, C-terminal SAM fold", "291": "PF12911\nN-terminal TM domain of oligopeptide transport permease C\nOligopeptide permeases (Opp) have been identified in numerous gram-negative and -positive bacteria. These transport systems belong to the superfamily of highly conserved ATP-binding cassette transporters. Typically, Opp importers comprise a complex of five proteins. 
The oligopeptide-binding protein OppA is responsible for the capture of peptides from the external medium. Two integral highly hydrophobic membrane spanning proteins, OppB and OppC, form a channel through the membrane used for peptide translocation. This N-terminal domain appears to be the first TM domain of the molecule [PMID:1738314].", "292": "PF13004\nPutative binding domain, N-terminal\nThe BACON (Bacteroidetes-Associated Carbohydrate-binding Often N-terminal) domain is an all-beta domain found in diverse architectures, principally in combination with carbohydrate-active enzymes and proteases. These architectures suggest a carbohydrate-binding function which is also supported by the nature of BACON's few conserved amino-acids. The phyletic distribution of BACON and other data tentatively suggest that it may frequently function to bind mucin [PMID:20416301]. Further work with the characterised structure of a member of glycoside hydrolase family 5 enzyme, PDB:3ZMR, has found no evidence for carbohydrate-binding for this domain [PMID:24463512].", "293": "PF13559\nDomain of unknown function (DUF4129)\nThis presumed domain is found at the C-terminus of proteins that contain a transglutaminase core domain. The function of this domain is unknown. The domain has a conserved TXXE motif.", "294": "PF02194\nPXA domain\nThis domain is associated with PX domains Pfam:PF00787.", "295": "PF01944\nStage II sporulation protein M\nSpoIIM is one of four stage II sporulation proteins that are necessary for the forespore inside the mother-cell to be properly internalised through the breakdown of peptidoglycans trapped between the membranes of the septum separating the forespore and the mother-cell. The four proteins working in sequence are SpoIIB, Pfam:PF05036, SpoIIM, SpoIIP, Pfam:PF07454, and finally SpoIID, Pfam:PF08486. 
D, M and P are in a complex with each other and the complex assembles in a hierarchical manner such that M, which serves as a membrane anchor, recruits P to the septum and P, in turn, recruits D to the septum [PMID:17376078].", "296": "PF02569\nPantoate-beta-alanine ligase\nPantoate-beta-alanine ligase, also know as pantothenate synthase, (EC:6.3.2.1) catalyses the formation of pantothenate from pantoate and alanine [PMID:374975].", "297": "Glutamyl-tRNAGlu reductase, dimerisation domain", "298": "PF17678\nGlycosyl hydrolase family 92 N-terminal domain\nThis domain is found at the N-terminus of family 92 glycosyl hydrolase proteins.", "299": "PF03028\nDynein heavy chain region D6 P-loop domain\nThis family represents the C-terminal region of dynein heavy chain. The chain also contains ATPase activity and microtubule binding ability and acts as a motor for the movement of organelles and vesicles along microtubules. Dynein is also involved in cilia and flagella movement. The dynein subunit consists of at least two heavy chains and a number of intermediate and light chains [PMID:7866389]. The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This C-terminal domain carries the D6 region of the dynein motor where the P-loop has been lost in evolution but the general structure of a potential ATP binding site appears to be retained [PMID:11250194].", "300": "PF06912\nProtein of unknown function (DUF1275)\nThis family consists of several hypothetical bacterial proteins of around 200 residues in length. 
The function of this family is unknown although most members have 6 TM regions, and may be putative permeases.", "301": "PF00681\nPlectin repeat\nThis family includes repeats from plectin, desmoplakin, envoplakin and bullous pemphigoid antigen.", "302": "PF01336\nOB-fold nucleic acid binding domain\nThis family contains OB-fold domains that bind to nucleic acids [PMID:10829230]. The family includes the anti-codon binding domain of lysyl, aspartyl, and asparaginyl -tRNA synthetases (See Pfam:PF00152). Aminoacyl-tRNA synthetases catalyse the addition of an amino acid to the appropriate tRNA molecule EC:6.1.1.-. This family also includes part of RecG helicase involved in DNA repair. Replication factor A is a hetero-trimeric complex, that contains a subunit in this family [PMID:7760808, PMID:8990123]. This domain is also found at the C-terminus of bacterial DNA polymerase III alpha chain.", "303": "PF10017\nHistidine-specific methyltransferase, SAM-dependent\nThe mycobacterial members of this family are expressed from part of the ergothioneine biosynthetic gene cluster. EGTD is the histidine methyltransferase that transfers three methyl groups to the alpha-amino moiety of histidine, in the first stage of the production of this histidine betaine derivative that carries a thiol group attached to the C2 atom of an imidazole ring [PMID:20420449].", "304": "NADH dehydrogenase", "305": "PF17910\nFeoB cytosolic helical domain\nFeoB is a G-protein coupled membrane protein essential for Fe(II) uptake in prokaryotes. In the structures, a canonical G-protein domain (G domain) is followed by a helical bundle domain (S-domain) which is represented by this entry.", "306": "PF04961\nFormiminotransferase-cyclodeaminase\nMembers of this family are thought to be Formiminotransferase- cyclodeaminase enzymes EC:4.3.1.4. 
This domain is found in the C-terminus of the bifunctional animal members of the family.", "307": "PF13802\nGalactose mutarotase-like\nThis family is found N-terminal to glycosyl-hydrolase domains, and appears to be similar to the galactose mutarotase superfamily.", "308": "PF01259\nSAICAR synthetase\nAlso known as Phosphoribosylaminoimidazole-succinocarboxamide synthase.", "309": "PF13742\nOB-fold nucleic acid binding domain\nThis family contains OB-fold domains that bind to nucleic acids.", "310": "PF02771\nAcyl-CoA dehydrogenase, N-terminal domain\nThe N-terminal domain of Acyl-CoA dehydrogenase is an all-alpha domain.", "311": "PF02690\nNa+/Pi-cotransporter\nThis is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporters protein allows re-absorption of filtered Pi in the proximal tubule [PMID:9826740].", "312": "Phosphatidylethanolamine-binding protein", "313": "PF03705\nCheR methyltransferase, all-alpha domain\nCheR proteins are part of the chemotaxis signaling mechanism in bacteria. CheR methylates the chemotaxis receptor at specific glutamate residues. CheR is an S-adenosylmethionine- dependent methyltransferase.", "314": "PF10728\nDomain of unknown function (DUF2520)\nThis presumed domain is found C-terminal to a Rossmann-like domain suggesting that these proteins are oxidoreductases.", "315": "Translation initiation factor IF-3, N-terminal domain", "316": "PF12464\nMaltose acetyltransferase\nThis domain family is found in bacteria, archaea and eukaryotes, and is approximately 50 amino acids in length. The family is found in association with Pfam:PF00132. 
Mac uses acetyl-CoA as acetyl donor to acetylated cytoplasmic maltose.", "317": "Signal peptide binding domain", "318": "PF18074\nPrimosomal protein N C-terminal domain\nThis is the C-terminal domain found in PriA DNA helicase, a multifunctional enzyme that mediates the process of restarting prematurely terminated DNA replication reactions in bacteria. The C-terminal domain (CTD) bears similarity to the S10 subunit which binds branched rRNA within the bacterial ribosome. The C-terminal domain is part of the helicase domain of PriA proteins. It acts together with the 3' DNA-binding domain to form a site for binding ssDNA-binding protein (SSB) [PMID:24379377].", "319": "PF07497\nRho termination factor, RNA-binding domain\nThe Rho termination factor disengages newly transcribed RNA from its DNA template at certain, specific transcripts. It it thought that two copies of Rho bind to RNA and that Rho functions as a hexamer of protomers [PMID:10230401].", "320": "PF04430\nProtein of unknown function (DUF498/DUF598)\nThis is a large family of uncharacterised proteins found in all domains of life. The structure shows a novel fold with three beta sheets. A dimeric form is found in the crystal structure. It was suggested that the cleft in between the two monomers might bing nucleic acid [PMID:11746696].", "321": "PF03124\nEXS family\nWe have named this region the EXS family after (ERD1, XPR1, and SYG1). This family includes C-terminus portions from the SYG1 G-protein associated signal transduction protein from Saccharomyces cerevisiae, and sequences that are thought to be murine leukaemia virus (MLV) receptors (XPR1). N-terminus portions from these proteins are aligned in the SPX Pfam:PF03105 family. The previously noted similarity between SYG1 and MLV receptors over their whole sequences [PMID:9990033] is thus borne out in Pfam:PF03105 and this family. 
While the N-termini aligned in Pfam:PF03105 are thought to be involved in signal transduction, the role of the C-terminus sequences aligned in this family is not known. This region of similarity contains several predicted transmembrane helices. This family also includes the ERD1 (ERD: ER retention defective) yeast proteins Swiss:P16151. ERD1 proteins are involved in the localisation of endogenous endoplasmic reticulum (ER) proteins. erd1 null mutants secrete such proteins even though they possess the C-terminal HDEL ER lumen localisation label sequence. In addition, null mutants also exhibit defects in the Golgi-dependent processing of several glycoproteins, which led to the suggestion that the sorting of luminal ER proteins actually occurs in the Golgi, with subsequent return of these proteins to the ER via `salvage' vesicles [PMID:2178921].", "322": "PF03372\nEndonuclease/Exonuclease/phosphatase family\nThis large family of proteins includes magnesium dependent endonucleases and a large number of phosphatases involved in intracellular signalling [PMID:10838565]. This family includes: AP endonuclease proteins EC:4.2.99.18 e.g Swiss:P27695, DNase I proteins EC:3.1.21.1 e.g. Swiss:P24855, Synaptojanin an inositol-1,4,5-trisphosphate phosphatase EC:3.1.3.56 Swiss:O43426, Sphingomyelinase EC:3.1.4.12 Swiss:P11889 and Nocturnin Swiss:O35710.", "323": "Prokaryotic diacylglycerol kinase", "324": "PF04298\nPutative neutral zinc metallopeptidase\nZinc metallopeptidase zinc binding regions have been predicted in some family members by a pattern match (Prosite:PS00142), of the characteristic HEXXH motif.", "325": "PF00115\nCytochrome C and Quinol oxidase polypeptide I\nCytochrome c oxidase (E.C:7.1.1.9) is a key enzyme in aerobic metabolism. Proton pumping haem-copper oxidases represent the terminal, energy-transfer enzymes of respiratory chains in prokaryotes and eukaryotes. 
The CuB-haem a3 (or haem o) binuclear centre, associated with the largest subunit I of cytochrome c and ubiquinol oxidases (E.C:1.10.3.11), is directly involved in the coupling between dioxygen reduction and proton pumping [PMID:8638158, PMID:8013452, PMID:31489376, PMID:8049679]. Some terminal oxidases generate a transmembrane proton gradient across the plasma membrane (prokaryotes) or the mitochondrial inner membrane (eukaryotes). The enzyme complex consists of 3-4 subunits (prokaryotes) up to 13 polypeptides (mammals) of which only the catalytic subunit (equivalent to mammalian subunit I (COXI) is found in all haem-copper respiratory oxidases. The presence of a bimetallic centre (formed by a high-spin haem and copper B) as well as a low-spin haem, both ligated to six conserved histidine residues near the outer side of four transmembrane spans within CO I is common to all family members [PMID:8013452, PMID:31489376].", "326": "PF16220\nDomain of unknown function (DUF4880)\nThis domain can be found on the N-terminal of uncharacterized proteins from various Rhodopseudomonas and Pseudomonas species, often, but not always followed by the ron siderophore sensor protein family (FecR, PF04773). The function of this domain is unknown.", "327": "PF00624\nFlocculin repeat\nThis short repeat is rich in serine and threonine residues.", "328": "PF14791\nDNA polymerase beta thumb\nThe catalytic region of DNA polymerase beta is split into three domains. An N-terminal fingers domain, a central palm domain and a C-terminal thumb domain. This entry represents the thumb domain [PMID:7516581].", "329": "PF03134\nTB2/DP1, HVA22 family\nThis family includes members from a wide variety of eukaryotes. It includes the TB2/DP1 (deleted in polyposis) protein (e.g. Swiss:Q00765), which in humans is deleted in severe forms of familial adenomatous polyposis, an autosomal dominant oncological inherited disease. 
The family also includes the plant protein of known similarity to TB2/DP1, the HVA22 abscisic acid-induced protein (e.g. Swiss:Q07764), which is thought to be a regulatory protein.", "330": "PF04815\nSec23/Sec24 helical domain\nCOPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is composed of five alpha helices.", "331": "PF02863\nArginine repressor, C-terminal domain\nThis is the C-terminal domain of the arginine repressor, responsible for arginine binding and multimerization [PMID:9334747, PMID:11856827]. It binds mainly Arg, but also ornithine, Pro and Tyr (Matilla et. al., FEMS Microbiology Reviews, fuab043, 45, 2021, 1. https://doi.org/10.1093/femsre/fuab043).", "332": "PTS system sorbose-specific iic component", "333": "PF17852\nDynein heavy chain AAA lid domain\nThis entry corresponds to the extension domain of AAA domain 5 in the dynein heavy chain [PMID:22398446]. This domain is composed of 8 alpha helices [PMID:22398446].", "334": "PF01687\nRiboflavin kinase\nThis family represents the C-terminal region of the bifunctional riboflavin biosynthesis protein known as RibC in Bacillus subtilis. The RibC protein from Bacillus subtilis has both flavokinase and flavin adenine dinucleotide synthetase (FAD-synthetase) activities. RibC plays an essential role in the flavin metabolism [PMID:9473052]. This domain is thought to have kinase activity [PMID:15468322].", "335": "Ribosomal protein L9, C-terminal domain", "336": "PF08669\nGlycine cleavage T-protein C-terminal barrel domain\nThis is a family of glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. 
The T-protein is an aminomethyl transferase. ", "337": "PF01832\nMannosyl-glycoprotein endo-beta-N-acetylglucosaminidase\nThis family includes Mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase EC:3.2.1.96. As well as the flageller protein J Swiss:P75942 that has been shown to hydrolyse peptidoglycan [PMID:10049388].", "338": "Alanine racemase, N-terminal domain", "339": "PF02244\nCarboxypeptidase activation peptide\nCarboxypeptidases are found in abundance in pancreatic secretions. The pro-segment moiety (activation peptide) accounts for up to a quarter of the total length of the peptidase, and is responsible for modulation of folding and activity of the pro-enzyme.", "340": "PF07004\nSperm-tail PG-rich repeat\nThis family represents a short conserved region carrying a PGP motif that is repeated in eukaryotic proteins of sperm-tails. Shippo orthologues from some species may include up to 40 Pro-Gly-Pro repeats.", "341": "PF02607\nB12 binding domain\nThis B12 binding domain is found in methionine synthase EC:2.1.1.13 Swiss:Q99707, and other shorter proteins that bind to B12. This domain is always found to the N-terminus of Pfam:PF02310. The structure of this domain is known [PMID:7992050], it is a 4 helix bundle. Many of the conserved residues in this domain are involved in B12 binding, such as those in the MXXVG motif.", "342": "PF03748\nFlagellar basal body-associated protein FliL\nThis FliL protein controls the rotational direction of the flagella during chemotaxis [PMID:3519573]. 
FliL is a cytoplasmic membrane protein associated with the basal body [PMID:10439416].", "343": "PF02503\nPolyphosphate kinase middle domain\nPolyphosphate kinase (Ppk) catalyses the formation of polyphosphate from ATP, with chain lengths of up to a thousand or more orthophosphate molecules.", "344": "Putative vitamin uptake transporter", "345": "PF06422\nCDR ABC transporter\nCorresponds to a region of the PDR/CDR subgroup of ABC transporters comprising extracellular loop 3, transmembrane segment 6 and linker region.", "346": "PF14031\nPutative serine dehydratase domain\nThis domain is found at the C-terminus of yeast D-serine dehydratase [PMID:17937657]. Structures have been solved for two bacterial members of this family. The yeast protein has been shown to be a zinc dependant enzyme.", "347": "PF05494\nMlaC protein\nMlaC is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane [PMID:19383799]. This family of proteins is involved in toluene tolerance, which is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation [2-3]. Many proteins are involved in these processes.", "348": "UbiA prenyltransferase family", "349": "PF04327\nCysteine protease Prp\nThis is a family of cysteine protease that are found to cleave the N-terminus extension of ribosomal subunit L27 in eubacteria. Proteins in this family are distinguished by a pair of invariant histidine and cysteine residues with conserved spacing that form the classic catalytic dyad of a cysteine protease [PMID:25388641].", "350": "PF09527\nPutative F0F1-ATPase subunit Ca2+/Mg2+ transporter\nThis model represents a protein found encoded in F1F0-ATPase operons in several genomes, including Methanosarcina barkeri (archaeal) and Chlorobium tepidum (bacterial). 
It is a small protein (about 100 amino acids) with long hydrophobic stretches and is presumed to be a subunit of the enzyme. It carries two transmembrane helices and is a magnesium or calcium uniporter. The atp operon of alkaliphilic Bacillus pseudofirmus OF4, as in most prokaryotes, contains the eight structural genes for the F-ATPase (ATP synthase), which are preceded by an atpI gene that encodes a membrane protein with 2 TMSs. A tenth gene, atpZ, has been found in this operon, which is upstream of and overlapping with atpI [PMID:12917488].", "351": "PF13280\nWYL domain\nWYL is a Sm-like SH3 beta-barrel fold containing domain. It is a member of the WYL-like superfamily, named for three conserved amino acids found in a subset of the superfamily. However, these residues are not strongly conserved throughout the family. Rather, the conservation pattern includes four basic residues and a position often occupied by a cysteine [PMID:24817877], which are predicted to line a ligand-binding groove typical of the Sm-like SH3 beta-barrels. A WYL domain protein (sll7009) is a negative regulator of the I-D CRISPR-Cas system in Synechocystis sp [PMID:23535141]. It is predicted to be a ligand-sensing domain that could bind negatively charged ligands, such as nucleotides or nucleic acid fragments, to regulate CRISPR-Cas and other defense systems such as the abortive infection AbiG system.", "352": "PF09719\nPutative redox-active protein (C_GCAxxG_C_C)\nThis entry represents a putative redox-active protein of about 140 residues, with four perfectly conserved Cys residues. It includes a CGAXXG motif. Most members are found within one or two loci of transporter or oxidoreductase genes. 
A member from Geobacter sulfurreducens, located in a molybdenum transporter operon, has a TAT (twin-arginine translocation) signal sequence for Sec-independent transport across the plasma membrane, a hallmark of bound prosthetic groups such as FeS clusters.", "353": "Vitamin B12 dependent methionine synthase, activation domain", "354": "PF17759\nPhenylalanyl tRNA synthetase beta chain CLM domain\nThis domain corresponds to the catalytic like domain (CLM) in the beta chain of phe tRNA synthetase [PMID:21082706].", "355": "PF00830\nRibosomal L28 family\nThe ribosomal 28 family includes L28 proteins from bacteria and chloroplasts. The L24 protein from yeast Swiss:P36525 also contains a region of similarity to prokaryotic L28 proteins. L24 from yeast is also found in the large ribosomal subunit", "356": "PF02502\nRibose/Galactose Isomerase\nThis family of proteins contains the sugar isomerase enzymes ribose 5-phosphate isomerase B (rpiB), galactose isomerase subunit A (LacA) and galactose isomerase subunit B (LacB). ", "357": "CBF/Mak21 family", "358": "Anaerobic ribonucleoside-triphosphate reductase", "359": "PF02660\nGlycerol-3-phosphate acyltransferase\nThis family of enzymes catalyses the transfer of an acyl group from acyl-ACP to glycerol-3-phosphate to form lysophosphatidic acid [PMID:16949372]].", "360": "Aconitase family (aconitate hydratase)", "361": "PF00662\nNADH-Ubiquinone oxidoreductase (complex I), chain 5 N-terminus\nThis sub-family represents an amino terminal extension of Pfam:PF00361. Only NADH-Ubiquinone chain 5 and eubacterial chain L are in this family. This sub-family is part of complex I which catalyses the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane.", "362": "PF08487\nVault protein inter-alpha-trypsin domain\nInter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. 
ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition [PMID:14744536]. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (Pfam:PF00092) in ITI heavy chains (ITIHs) and their precursors.", "363": "PF13959\nDomain of unknown function (DUF4217)\nThis short domain is found at the C-terminus of many helicase proteins.", "364": "PF00343\nCarbohydrate phosphorylase\nThe members of this family catalyse the formation of glucose 1-phosphate from one of the following polyglucoses; glycogen, starch, glucan or maltodextrin.", "365": "Large-conductance mechanosensitive channel, MscL", "366": "PF03808\nGlycosyl transferase WecG/TagA/CpsF family\nThe WecG member of this family, believed to be UDP-N-acetyl-D-mannosaminuronic acid transferase, plays a role in enterobacterial common antigen (eca) synthesis in Escherichia coli. Another family member, the Bacillus subtilis TagA protein, is involved in the biosynthesis of the cell wall polymer poly(glycerol phosphate). The third family member, CpsF, CMP-N-acetylneuraminic acid synthetase has a role in the capsular polysaccharide biosynthesis pathway. Also included in this group is Xanthomonas campestris pv. campestris GumM, a glycosyltransferase participating in the biosynthesis of the exopolysaccharide xanthan [PMID:8830246, PMID:11673418, PMID:12618464, PMID:18156271, PMID:16953575, PMID:3275612, PMID:9537354].", "367": "PF16177\nAcetyl-coenzyme A synthetase N-terminus\nThis domain is found at the N-terminus of many acetyl-coenzyme A synthetase enzymes.", "368": "Peptidyl-tRNA hydrolase", "369": "PF02559\nCarD-like/TRCF domain\nCarD is a Myxococcus xanthus protein required for the activation of light- and starvation-inducible genes [PMID:8692912]. This family includes the presumed N-terminal domain, CdnL. 
CarD interacts with the zinc-binding protein CarG to form a complex that regulates multiple processes in Myxococcus xanthus [PMID:16879646]. This family also includes a domain to the N-terminal side of the DEAD helicase of TRCF (transcription-repair-coupling factor) proteins. TRCF displaces RNA polymerase stalled at a lesion, binds to the damage recognition protein UvrA, and increases the template strand repair rate during transcription [PMID:7876261]. This domain is involved in binding to the stalled RNA polymerase [PMID:7876261]. The family includes members otherwise referred to as CdnL, for CarD N-terminal like, which differ functionally from CarD. The TRCF domain mentioned above is the RNA polymerase-interacting domain or RID [PMID:20371514].", "370": "PF02548\nKetopantoate hydroxymethyltransferase\nKetopantoate hydroxymethyltransferase (EC:2.1.2.11) is the first enzyme in the pantothenate biosynthesis pathway.", "371": "Ribosomal protein L17", "372": "PF14490\nHelix-hairpin-helix containing domain\nThis presumed domain contains at least one helix-hairpin-helix motif. This domain is often found in RecD helicases.", "373": "PF03727\nHexokinase\nHexokinase (EC:2.7.1.1) contains two structurally similar domains represented by this family and Pfam:PF00349. Some members of the family have two copies of each of these domains.", "374": "PF04408\nHelicase associated domain (HA2)\nThis presumed domain is about 90 amino acid residues in length. It is found is a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding.", "375": "Protein of unknown function (DUF433)", "376": "PF04015\nDomain of unknown function (DUF362)\nDomain that is sometimes present in iron-sulphur proteins.", "377": "PF13396\nPhospholipase_D-nuclease N-terminal\nThis family is often found at the very N-terminus of proteins from the phospholipase_D-nuclease family, PLDc, Pfam:PF00614. 
However, a large number of members are full-length within this family.", "378": "PF04461\nProtein of unknown function (DUF520)\nThe structure of the DUF520 family of uncharacterised proteins shows that it has composed of two domains each of which has the same topology [PMID:12943362].", "379": "PF01679\nProteolipid membrane potential modulator\nPmp3 is an evolutionarily conserved proteolipid in the plasma membrane which, in S. pombe, is transcriptionally regulated by the Spc1 stress MAPK (mitogen-activated protein kinases) pathway. It functions to modulate the membrane potential, particularly to resist high cellular cation concentration. In eukaryotic organisms, stress-activated mitogen-activated protein kinases play crucial roles in transmitting environmental signals that will regulate gene expression for allowing the cell to adapt to cellular stress. Pmp3-like proteins are highly conserved in bacteria, yeast, nematode and plants.", "380": "NUDIX domain", "381": "PF02207\nPutative zinc finger in N-recognin (UBR box)\nThis region is found in E3 ubiquitin ligases that recognise N-recognins [PMID:16055722].", "382": "PF01535\nPPR repeat\nThis repeat has no known function. It is about 35 amino acids long and found in up to 18 copies in some proteins. This family appears to be greatly expanded in plants. This repeat occurs in PET309 Swiss:P32522 that may be involved in RNA stabilisation [PMID:7664742]. This domain occurs in crp1 that is involved in RNA processing [PMID:8039510]. This repeat is associated with a predicted plant protein Swiss:O49549 that has a domain organisation similar to the human BRCA1 protein. 
The repeat has been called PPR [PMID:10664580].", "383": "PF13545\nCrp-like helix-turn-helix domain\nThis family represents a crp-like helix-turn-helix domain that is likely to bind DNA.", "384": "PF03073\nTspO/MBR family\nTryptophan-rich sensory protein (TspO) is an integral membrane protein that acts as a negative regulator of the expression of specific photosynthesis genes in response to oxygen/light [PMID:7673149]. It is involved in the efflux of porphyrin intermediates from the cell. This reduces the activity of coproporphyrinogen III oxidase, which is thought to lead to the accumulation of a putative repressor molecule that inhibits the expression of specific photosynthesis genes. Several conserved aromatic residues are necessary for TspO function: they are thought to be involved in binding porphyrin intermediates [PMID:10681549]. In [PMID:9144197], the rat mitochondrial peripheral benzodiazepine receptor (MBR) was shown to not only retain its structure within a bacterial outer membrane, but also to be able to functionally substitute for TspO in TspO- mutants, and to act in a similar manner to TspO in its in situ location: the outer mitochondrial membrane. The biological significance of MBR remains unclear, however. It is thought to be involved in a variety of cellular functions, including cholesterol transport in steroidogenic tissues.", "385": "PF01940\nIntegral membrane protein DUF92\nMembers of this family have several predicted transmembrane helices. One member of the family has been characterised as protein PGR (AtPGR). PGR is suggested to be a potential glucose-responsive regulator in carbohydrate metabolism in plants. 
This entry also includes protein VTE6, which is a Pphytyl-phosphate kinase catalysing the conversion of phytyl-monophosphate to phytyl-diphosphate [PMID:26452599].", "386": "NADH-ubiquinone oxidoreductase-F iron-sulfur binding region", "387": "PF00571\nCBS domain\nCBS domains are small intracellular modules that pair together to form a stable globular domain [PMID:10200156]. This family represents a single CBS domain. Pairs of these domains have been termed a Bateman domain [PMID:14722609]. CBS domains have been shown to bind ligands with an adenosyl group such as AMP, ATP and S-AdoMet [PMID:14722619]. CBS domains are found attached to a wide range of other protein domains suggesting that CBS domains may play a regulatory role making proteins sensitive to adenosyl carrying ligands. The region containing the CBS domains in Cystathionine-beta synthase is involved in regulation by S-AdoMet [PMID:11524006]. CBS domain pairs from AMPK bind AMP or ATP [PMID:14722619]. The CBS domains from IMPDH and the chloride channel CLC2 bind ATP [PMID:14722619].", "388": "PF02225\nPA domain\nThe PA (Protease associated) domain is found as an insert domain in diverse proteases. The PA domain is also found in a plant vacuolar sorting receptor Swiss:O22925 and members of the RZF family Swiss:O43567. It has been suggested that this domain forms a lid-like structure that covers the active site in active proteases, and is involved in protein recognition in vacuolar sorting receptors [PMID:11246007].", "389": "Beige/BEACH domain", "390": "PF13774\nRegulated-SNARE-like domain\nLongin is one of the approximately 26 components required for transporting proteins from the ER to the plasma membrane, via the Golgi apparatus. It is necessary for the steps of the transfer from the ER to the Golgi complex [PMID:16855025]. 
Longins are the only R-SNAREs that are common to all eukaryotes, and they are characterised by a conserved N-terminal domain with a profilin-like fold called a longin domain [PMID:15544955].", "391": "PF05025\nRbsD / FucU transport protein family\nThe Escherichia coli high-affinity ribose-transport system consists of six proteins encoded by the rbs operon (rbsD, rbsA, rbsC, rbsB, rbsK and rbsR). RbsD was originally thought to be a high affinity ribose transport protein, but further analysis [PMID:16731978] shows that it is a D-ribose pyranase . It catalyzes the interconversion of beta-pyran and beta-furan forms of D-ribose. It also catalyzes the conversion between beta-allofuranose and beta-allopyranose. This family also includes FucU a component of the fucose operon and is a L-fucose mutarotase, involved in the anomeric conversion of L-fucose. It also exhibits a pyranase activity for D-ribose [PMID:16731978]. Both have been classified in the RbsD/FucU family of proteins. Members of this family are ubiquitous having been found in organisms from eubacteria to mammals.", "392": "Phosphoglycerate kinase", "393": "PF00393\n6-phosphogluconate dehydrogenase, C-terminal domain\nThis family represents the C-terminal all-alpha domain of 6-phosphogluconate dehydrogenase. The domain contains two structural repeats of 5 helices each.", "394": "PF08494\nDEAD/H associated\nThis domain is found in ATP-dependent helicases as well as a number of hypothetical proteins together with the helicase conserved C-terminal domain (Pfam:PF00270) and the Pfam:PF00271 domain.", "395": "PF13356\nArm DNA-binding domain\nThis DNA-binding domain is found at the N-terminus of a wide variety of phage integrase proteins.", "396": "Eukaryotic-type carbonic anhydrase", "397": "Phosphoglucomutase/phosphomannomutase, alpha/beta/alpha domain III", "398": "PF01430\nHsp33 protein\nHsp33 is a molecular chaperone, distinguished from all other known chaperones by its mode of functional regulation. 
Its activity is redox regulated. Hsp33 is a cytoplasmically localised protein with highly reactive cysteines that respond quickly to changes in the redox environment. Oxidising conditions like H2O2 cause disulfide bonds to form in Hsp33, a process that leads to the activation of its chaperone function [PMID:10025400].", "399": "PF02424\nApbE family\nThis prokaryotic family of lipoproteins are related to ApbE from Salmonella typhimurium. ApbE is involved in thiamine synthesis [PMID:9473043]. It acts as an FAD:protein FMN-transferase, catalysing the attachment of an FMN residue to a threonine residue of a protein via a phosphoester bond in such bacterial flavoproteins [PMID:23558683].", "400": "PF08522\nDomain of unknown function (DUF1735)\nThis domain of unknown function is found in a number of bacterial proteins including acylhydrolases. The structure of this domain has a beta-sandwich fold.", "401": "PF16212\nPhospholipid-translocating P-type ATPase C-terminal\nPhoLip_ATPase_C is found at the C-terminus of a number of phospholipid-translocating ATPases. It is found in higher eukaryotes.", "402": "PF13234\nrRNA-processing arch domain\nMtr4 is the essential RNA helicase, and is an exosome-activating cofactor. This arch domain is carried in Mtr4 and Ski2 (the cytosolic homologue of Mtr4). The arch domain is required for proper 5.8S rRNA processing, and appears to function independently of canonical helicase activity [PMID:20512111].", "403": "PF00617\nRasGEF domain\nGuanine nucleotide exchange factor for Ras-like small GTPases.", "404": "PF11929\nDomain of unknown function (DUF3447)\nThis presumed domain is functionally uncharacterised. This domain is found in eukaryotes. This domain is about 80 amino acids in length. This domain is found associated with Pfam:PF00023. This domain has a conserved SHN sequence motif. 
It seems likely that this region represents divergent Ankyrin repeats.", "405": "PF02572\nATP:corrinoid adenosyltransferase BtuR/CobO/CobP\nThis family consists of the BtuR, CobO, CobP proteins all of which are Cob(I)alamin adenosyltransferase, EC:2.5.1.17, involved in cobalamin (vitamin B12) biosynthesis. These enzymes catalyse the adenosylation reaction: ATP + cob(I)alamin + H2O <=> phosphate + diphosphate + adenosylcobalamin.", "406": "PF02733\nDak1 domain\nThis is the kinase domain of the dihydroxyacetone kinase family EC:2.7.1.29. ", "407": "PF02843\nPhosphoribosylglycinamide synthetase, C domain\nPhosphoribosylglycinamide synthetase catalyses the second step in the de novo biosynthesis of purine. The reaction catalysed by Phosphoribosylglycinamide synthetase is the ATP- dependent addition of 5-phosphoribosylamine to glycine to form 5'phosphoribosylglycinamide. This domain is related to the C-terminal domain of biotin carboxylase/carbamoyl phosphate synthetase (see Pfam:PF02787).", "408": "PF09383\nNIL domain\nThis domain is found at the C-terminus of ABC transporter proteins involved in D-methionine transport as well as a number of ferredoxin-like proteins. This domain is likely to act as a substrate binding domain. The domain has been named after a conserved sequence in some members of the family.", "409": "PF09118\nGalactose oxidase-like, Early set domain\nE or 'early' set domains are associated with the catalytic domain of galactose oxidase at the C-terminal end. Galactose oxidase is an extracellular monomeric enzyme which catalyzes the stereospecific oxidation of a broad range of primary alcohol substrates, and possesses a unique mononuclear copper site essential for catalysing a two-electron transfer reaction during the oxidation of primary alcohols to corresponding aldehydes. The second redox active centre necessary for the reaction was found to be situated at a tyrosine residue. 
The C-terminal domain of galactose oxidase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end, and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family adopt a secondary structure consisting of a bundle of seven, mostly antiparallel, beta-strands surrounding a hydrophobic core. The 7 strands are arranged in 2 sheets, in a Greek-key topology [PMID:11698678]. This domain is found in sugar-utilising enzymes, such as galactose oxidase or chitinase [2-6].", "410": "PF06441\nEpoxide hydrolase N terminus\nThis family represents the N-terminal region of the eukaryotic epoxide hydrolase protein. Epoxide hydrolases (EC:3.3.2.3) comprise a group of functionally related enzymes that catalyse the addition of water to oxirane compounds (epoxides), thereby usually generating vicinal trans-diols. EHs have been found in all types of living organisms, including mammals, invertebrates, plants, fungi and bacteria. In animals, the major interest in EH is directed towards their detoxification capacity for epoxides since they are important safeguards against the cytotoxic and genotoxic potential of oxirane derivatives that are often reactive electrophiles because of the high tension of the three-membered ring system and the strong polarization of the C--O bonds. This is of significant relevance because epoxides are frequent intermediary metabolites which arise during the biotransformation of foreign compounds [PMID:10548561]. This family is often found in conjunction with Pfam:PF00561.", "411": "RecR protein", "412": "PF04982\nHPP family\nThese proteins are integral membrane proteins with four transmembrane spanning helices. The most conserved region of the alignment is a motif HPP. 
The function of these proteins is uncertain but they may be transporters.", "413": "PF01066\nCDP-alcohol phosphatidyltransferase\nAll of these members have the ability to catalyse the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoride anhydride bond.", "414": "PF13376\nBacteriocin-protection, YdeI or OmpD-Associated\nThis is a family of archaeal and bacterial proteins predicted to be periplasmic. YdeI is important for resistance to polymyxin B in broth and for bacterial survival in mice upon oral, but not intraperitoneal inoculation, suggesting a role for YdeI in the gastrointestinal tract of mice [PMID:17010160]. Production of the ydeI gene is regulated by the Rcs (regulator of capsule synthesis) phospho-relay system pathway independently of RcsA, and additionally transcription of the protein is regulated by the stationary-phase sigma factor, RpoS (sigma-S) [PMID:17010160]. YdeI confers protection against cationic AMPs (Antimicrobial peptides) or bacteriocins in conjunction with the general porin Omp, thus justifying its name of OmdA, for OmpD-Associated protein [PMID:19767429].", "415": "PF01106\nNifU-like domain\nThis is an alignment of the carboxy-terminal domain. This is the only common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown [PMID:8048161].", "416": "Dehydratase family", "417": "PF07735\nF-box associated\nMost of these proteins contain Pfam:PF00646 at the N terminus, suggesting that they are effectors linked with ubiquitination.", "418": "PF07664\nFerrous iron transport protein B C terminus\nEscherichia coli has an iron(II) transport system (feo) which may make an important contribution to the iron supply of the cell under anaerobic conditions [PMID:8407793]. FeoB has been identified as part of this transport system. FeoB is a large 700-800 amino acid integral membrane protein. 
The N-terminus has been previously erroneously described as being ATP-binding [PMID:8407793]. Recent work shows that it is similar to eukaryotic G-proteins and that it is a GTPase [PMID:12446835]. ", "419": "Ribosomal protein L13", "420": "PF12937\nF-box-like\nThis is an F-box-like family.", "421": "PF10385\nRNA polymerase beta subunit external 1 domain\nRNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared with three in eukaryotes (not including mitochondrial or chloroplast polymerases). This domain in prokaryotes spans the gap between domains 4 and 5 of the yeast protein. It is also known as the external 1 region of the polymerase and is bound in association with the external 2 region [PMID:11313498].", "422": "PF02590\nPredicted SPOUT methyltransferase\nThis family of proteins are predicted to be SPOUT methyltransferases [PMID:17338813].", "423": "PF01018\nGTP1/OBG\nThe N-terminal domain of Swiss:P20964 has the OBG fold, which is formed by three glycine-rich regions inserted into a small 8-stranded beta-sandwich these regions form six left-handed collagen-like helices packed and H-bonded together.", "424": "PF02538\nHydantoinase B/oxoprolinase\nThis family includes N-methylhydaintoinase B which converts hydantoin to N-carbamyl-amino acids, and 5-oxoprolinase (Swiss:P97608) EC:3.5.2.9 which catalyses the formation of L-glutamate from 5-oxo-L-proline. These enzymes are part of the oxoprolinase family and are related to Pfam:PF01968.", "425": "ATP synthase", "426": "PF17876\nCold shock domain\nCrystallographic structure analysis of E. coli wild-type RNase II revealed that the amino-terminal region starts with an alpha-helix followed by two consecutive five-stranded anti-parallel beta-barrels, identified as cold-shock domains (CSD1 and CSD2). 
This entry relates to CSD2 which lacks the typical sequence motifs RNPI and RNPII but contributes to RNA binding [PMID:16996291] [PMID:16957732].", "427": "PF08044\nDomain of unknown function (DUF1707)\nThis domain is found in a variety of Actinomycetales proteins. All of the proteins containing this domain are hypothetical and probably membrane bound or associated. Currently, it is unclear to the function of this domain.", "428": "Aspartate/ornithine carbamoyltransferase, carbamoyl-P binding domain", "429": "PF01392\nFz domain\nAlso known as the CRD (cysteine rich domain), the C6 box in MuSK receptor. This domain of unknown function has been independently identified by several groups [PMID:9637908, PMID:9637909, PMID:9684897, PMID:9852758]. The domain contains 10 conserved cysteines.", "430": "PF02657\nFe-S metabolism associated domain\nThis family consists of the SufE-related proteins. These have been implicated in Fe-S metabolism and export [PMID:11251816]).", "431": "PF01027\nInhibitor of apoptosis-promoting Bax1\nProgrammed cell-death involves a set of Bcl-2 family proteins, some of which inhibit apoptosis (Bcl-2 and Bcl-XL) and some of which promote it (Bax and Bak). Human Bax inhibitor, BI-1, is an evolutionarily conserved integral membrane protein containing multiple membrane-spanning segments predominantly localised to intracellular membranes. It has 6-7 membrane-spanning domains. The C termini of the mammalian BI-1 proteins are comprised of basic amino acids resembling some nuclear targeting sequences, but otherwise the predicted proteins lack motifs that suggest a function. As plant BI-1 appears to localise predominantly to the ER, we hypothesized that plant BI-1 could also regulate cell death triggered by ER stress [PMID:19704470]. BI-1 appears to exert its effect through an interaction with calmodulin [PMID:19742129]. 
The budding yeast member of this family has been found unexpectedly to encode a BH3 domain-containing protein (Ybh3p) that regulates the mitochondrial pathway of apoptosis in a phylogenetically conserved manner [PMID:21673659]. Examination of the crystal structure of a bacterial member of this family shows that these proteins mediate a calcium leak across the membrane that is pH-dependent. Calcium homoeostasis balances passive calcium leak with active calcium uptake. The structure exists in a pore-closed and pore-open conformation, at pHs of 8 and 6 respectively, and the pore can be opened by intracrystalline transition; together these findings suggest that pH controls the conformational transition [PMID:24904158].", "432": "PF02405\nPermease MlaE\nMlaE is a permease which in E. coli is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane [PMID:19383799]. In Swiss:Q7DD59 it is involved in L-glutamate import into the cell [PMID:16495545]. 
In Swiss:Q8L4R0 it is involved in lipid transfer within the cell [PMID:16495545].", "433": "PF00253\nRibosomal protein S14p/S29e\nThis family includes both ribosomal S14 from prokaryotes and S29 from eukaryotes.", "434": "PF08345\nFlagellar M-ring protein C-terminal\nThis domain is found in bacterial flagellar M-ring (FliF) proteins together with the YscJ/FliF domain (Pfam:PF01514).", "435": "PF01329\nPterin 4 alpha carbinolamine dehydratase\nPterin 4 alpha carbinolamine dehydratase is also known as DCoH (dimerisation cofactor of hepatocyte nuclear factor 1-alpha).", "436": "PF00883\nCytosol aminopeptidase family, catalytic domain\nThe two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase.", "437": "PF18759\nPlavaka transposase\nA transposase with an RNaseH catalytic domain that often has a histone binding BAM/BAH domain at the C-terminus and is sometimes associated with TET/JBP family of dioxygenases in fungi [PMID:24398522].", "438": "PF10369\nSmall subunit of acetolactate synthase\nALS_ss_C is the C-terminal half of a family of proteins which are the small subunits of acetolactate synthase. Acetolactate synthase is a tetrameric enzyme, containing probably two large and two small subunits, which catalyses the first step in branched-chain amino acid biosynthesis. This reaction is sensitive to certain herbicides [PMID:9197540].", "439": "Zinc finger, C3HC4 type (RING finger)", "440": "PF01730\nUreF\nThis family consists of the Urease accessory protein UreF. The urease enzyme (urea amidohydrolase) hydrolyses urea into ammonia and carbamic acid [PMID:8550495]. UreF is proposed to modulate the activation process of urease by eliminating the binding of nickel irons to noncarbamylated protein [PMID:8808930].", "441": "PF05949\nBacterial protein of unknown function (DUF881)\nThis family consists of a series of hypothetical bacterial proteins. 
One of the family members Swiss:Q45543 from Bacillus subtilis is thought to be involved in cell division and sporulation [PMID:2556375].", "442": "PF02698\nDUF218 domain\nThis large family of proteins contains several highly conserved charged amino acids, suggesting this may be an enzymatic domain (Bateman A pers. obs). The family includes SanA Swiss:P33017 that is involved in Vancomycin resistance [PMID:8550448]. This protein may be involved in murein synthesis [PMID:9738879].", "443": "PF13508\nAcetyltransferase (GNAT) domain\nThis domain catalyses N-acetyltransferase reactions.", "444": "PF04248\nDomain of unknown function (DUF427)\nThis domain contains a beta-tent fold [PMID:25569776].", "445": "PF01926\n50S ribosome-binding GTPase\nThe full-length GTPase protein is required for the complete activity of the protein of interacting with the 50S ribosome and binding of both adenine and guanine nucleotides, with a preference for guanine nucleotide.", "446": "PF17042\nNucleotide-binding C-terminal domain\nThis is the C-terminal domain found in proteins in a range of Proteobacteria as well as the Gram-positive Oceanobacillus iheyensis. Structural analysis of the whole protein indicates the N- and C-termini act together to produce a surface into which a threonate-ADP complex is bound, demonstrating that a sugar binding site is on the N-terminal domain, and a nucleotide binding site is in the C-terminal domain [PMID:27402745]. There is a critical motif, DDXTG, at approximately residues 22-25. Proteins containing this domain have been predicted as kinases. Some members are associated with PdxA2 by physical clustering and gene fusion with PdxA2. Some members that are fused with PdxA2 have been shown to be involved in L-4-hydroxythreonine (4HT) phosphorylation, part of the alternative pathway to make PLP (pyridoxal 5'-phosphate) out of a toxic metabolite, 4HT. However, 4HT phosphorylation might not be the main function of this group of proteins. 
Moreover, some members that are not associated with pdxA2, and even one that is associated with pdxA2, have lost 4HT kinase activity [PMID:27294475]. Functional analysis demonstrate that family members include D-Threonate kinases (DtnK), D-Erythronate kinases (DenK) and 3-Oxo-tetronate kinases (OtnK) [PMID:27402745].", "447": "PF00380\nRibosomal protein S9/S16\nThis family includes small ribosomal subunit S9 from prokaryotes and S16 from eukaryotes.", "448": "SecY", "449": "PF13277\nYmdB-like protein\nThis family of putative phosphoesterases contains the B. subtilis protein YmdB Swiss:O31775.", "450": "PF04960\nGlutaminase\nThis family of enzymes deaminates glutamine to glutamate EC:3.5.1.2.", "451": "PF09989\nCoA enzyme activase uncharacterised domain (DUF2229)\nMembers of this family include various bacterial hypothetical proteins, as well as CoA enzyme activases. The exact function of this domain has not, as yet, been defined.", "452": "PF04860\nPhage portal protein\nBacteriophage portal proteins form a dodecamer and is located at a five-fold vertex of the viral capsid. The portal complex forms a channel through which the viral DNA is packaged into the capsid, and exits during infection. The portal protein is though to rotate during DNA packaging [PMID:11839289]. Portal proteins from different phage show little sequence homology, so this family does not represent all portal proteins.", "453": "PF03938\nOuter membrane protein (OmpH-like)\nThis family includes outer membrane proteins such as OmpH among others. 
Skp (OmpH) has been characterised as a molecular chaperone that interacts with unfolded proteins as they emerge in the periplasm from the Sec translocation machinery [PMID:15304217].", "454": "PF08436\n1-deoxy-D-xylulose 5-phosphate reductoisomerase C-terminal domain\nThis domain is found to the C-terminus of Pfam:PF02670 domains in bacterial and plant 1-deoxy-D-xylulose 5-phosphate reductoisomerases which catalyse the formation of 2-C-methyl-D-erythritol 4-phosphate from 1-deoxy-D-xylulose-5-phosphate in the presence of NADPH [PMID:9707569].", "455": "Ribosomal protein S21", "456": "PF18766\nSWI2/SNF2 ATPase\nA SWi2/SNF2 ATPase found in polyvalent proteins [PMID:28559295].", "457": "Cullin family", "458": "DNA polymerase family A", "459": "PF09363\nXFP C-terminal domain\nBacterial enzyme splits fructose-6-P and/or xylulose-5-P with the aid of inorganic phosphate into either acetyl-P and erythrose-4-P and/or acetyl-P and glyeraldehyde-3-P EC:4.1.2.9, EC:4.1.2.22 [PMID:11292814].", "460": "PF13927\nImmunoglobulin domain\nThis family contains immunoglobulin-like domains.", "461": "PF00006\nATP synthase alpha/beta family, nucleotide-binding domain\nThis entry includes the ATP synthase alpha and beta subunits, the ATP synthase associated with flagella and the termination factor Rho.", "462": "PF10035\nUncharacterized protein conserved in bacteria (DUF2179)\nThis domain, found in various hypothetical bacterial proteins, has no known function.", "463": "PF03747\nADP-ribosylglycohydrolase\nThis family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine [PMID:8349667]. The family also includes dinitrogenase reductase activating glycohydrolase [PMID:2506427]. 
Most surprisingly the family also includes jellyfish crystallins [PMID:2506427], these proteins appear to have lost the presumed active site residues.", "464": "PF12806\nAcetyl-CoA dehydrogenase C-terminal like\nthis domain would appear to be the very C-terminal region of many bacterial acetyl-CoA dehydrogenases.", "465": "PF01618\nMotA/TolQ/ExbB proton channel family\nThis family groups together integral membrane proteins that appear to be involved translocation of proteins across a membrane. These proteins are probably proton channels. MotA is an essential component of the flageller motor that uses a proton gradient to generate rotational motion in the flageller [PMID:10348868]. ExbB is part of the TonB-dependent transduction complex. The TonB complex uses the proton gradient across the inner bacterial membrane to transport large molecules across the outer bacterial membrane.", "466": "PF18198\nDynein heavy chain AAA lid domain\nThis family represents the AAA lid domain found neat the C-terminal region of dynein heavy chain.", "467": "PF00611\nFes/CIP4, and EFC/F-BAR homology domain\nAlignment extended from [PMID:9210375]. Highly alpha-helical. The cytosolic endocytic adaptor proteins in fungi carry this domain at the N-terminus; several of these have been referred to as muniscin proteins [PMID:19713939]. These N-terminal BAR, N-BAR, and EFC/F-BAR domains are found in proteins that regulate membrane trafficking events by inducing membrane tubulation. The domain dimerises into a curved structure that binds to liposomes and either senses or induces the curvature of the membrane bilayer to cause biophysical changes to the shape of the bilayer; it also thereby recruits other trafficking factors, such as the GTPase dynamin. Most EFC/F-BAR domain-family members localise to actin-rich structures [PMID:16938488].", "468": "jmjN domain", "469": "PF02256\nIron hydrogenase small subunit\nThis family represents the small subunit of the Fe-only hydrogenases EC:1.18.99.1. 
The subunit is comprised of alternating random coil and alpha helical structures that encompasses the large subunit in a novel protein fold [PMID:10368269].", "470": "PF01357\nExpansin C-terminal domain\nThis domain is found at the C-terminus of expansins, plant cell wall proteins involved in the non-enzymatic rearrangement of cell walls during cell growth. It contains the allergens lol PI, PII and PIII from Lolium perenne.", "471": "Polyprenyl synthetase", "472": "PF00391\nPEP-utilising enzyme, mobile domain\nThis domain is a \"swivelling\" beta/beta/alpha domain which is thought to be mobile in all proteins known to contain it.", "473": "PF02410\nRibosomal silencing factor during starvation\nThis family is expressed by almost all bacterial and eukaryotic genomes but not by archaea. Its function is to down-regulate protein synthesis under conditions of nutrient shortage, and it does this by binding to protein L14 of the large ribosomal subunit, thus acting as a ribosomal silencing factor (RsfS) by blocking the joining of the ribosomal subunits [PMID:22829778]. This family is structurally homologous to nucleotidyltransferases.", "474": "WD domain, G-beta repeat", "475": "PF01043\nSecA preprotein cross-linking domain\nThe SecA ATPase is involved in the insertion and retraction of preproteins through the plasma membrane. This domain has been found to cross-link to preproteins, thought to indicate a role in preprotein binding. The pre-protein cross-linking domain is comprised of two sub domains that are inserted within the ATPase domain [PMID:12242434]. ", "476": "PF01773\nNa+ dependent nucleoside transporter N-terminus\nThis family consists of nucleoside transport proteins. Swiss:Q62773 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane [PMID:7775409]. 
Swiss:Q62674 is a a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine it also transports the anti-viral nucleoside analogues AZT and ddC [PMID:8027026]. This alignment covers the N terminus of this family", "477": "PF09365\nConserved hypothetical protein (DUF2461)\nMembers of this family are widely (though sparsely) distributed bacterial proteins, about 230 residues in length. All members have a motif RxxRDxRFxxx[DN]KxxY. The function of this protein family is unknown.", "478": "Leucyl/phenylalanyl-tRNA protein transferase", "479": "PF04066\nMultiple resistance and pH regulation protein F (MrpF / PhaF)\nMembers of the PhaF / MrpF family are predicted to be an integral membrane proteins with three transmembrane regions, involved in regulation of pH. PhaF is part of a potassium efflux system involved in pH regulation. It is also involved in symbiosis in Rhizobium meliloti [PMID:9680201]. MrpF is part of a Na+/H+ antiporter complex, also involved in pH homeostasis. MrpF is thought to be an efflux system for Na+ and cholate [PMID:10198001]. The Mrp system in Bacilli may also have primary energisation capacities [PMID:11356194].", "480": "Histidinol dehydrogenase", "481": "PF00397\nWW domain\nThe WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro.", "482": "PF03489\nSaposin-like type B, region 2\nSaposin B is a small non-enzymatic glycoprotein required for the breakdown of cerebroside sulphates (sulphatides) in lysosomes. Saposin B contains three intramolecular disulphide bridges, exists as a dimer and is remarkably heat, protease and pH stable. The crystal structure of human saposin B reveals an unusual shell-like dimer consisting of a monolayer of alpha-helices enclosing a large hydrophobic cavity [PMID:7610480, PMID:12518053]. It is one of the most studied members of the saposin protein family and it is involved in the hydrolysis of glycolipids and glycerolipids. 
SapB is unique in the saposin family in that it facilitates degradation by interacting with the substrate, not the enzymes [PMID:26616259].", "483": "PF03458\nGlycine transporter\nThis domain contains three transmembrane helices. Proteins containing this domain are important for glycine utilisation, being identified as glycine transporters. Some members of the family are also important for alanine utilisation. In these proteins this domain is found in pairs [PMID:29769716]. An archaeal member of this family which contains this domain is a TRIC-type potassium channel [PMID:28524849].", "484": "PF01176\nTranslation initiation factor 1A / IF-1\nThis family includes both the eukaryotic translation factor eIF-1A and the bacterial translation initiation factor IF-1.", "485": "Family 4 glycosyl hydrolase C-terminal domain", "486": "PF02643\nUncharacterized ACR, COG1430\nTwo structures have been solved for members of this large (>500 members) family of bacterial proteins present mostly in environmental bacteria and metagenomes (distant homologues are also present in several Plasmodium species). TOPSAN analysis for pdb:3pjy shows that there is much similarity with the other solved structure, pdb:3m7a, solved for UniProt:Q2GA55 (Saro_0823), a homologue of Thermotoga maritima TM1668, UniProt:Q9X1Z6., The homologue in Caulobacter crescentus (CC1388), UniProt:Q9A8G6, is associated with CspD, a cold shock protein (CC1387), UniProt:Q9A8G7. However, the genomic context of UniProt:Q2GA55 is most conserved with a putative xylose isomerase, suggesting a possible role in extracellular sugar processing. Saro_0821, UniProt:Q2GA57, is annotated as an AMP-dependent synthetase and ligase. PDB:3m7a structure corresponds to the C-terminal (27-165) fragment of the YP_496102 (Saro_0823) protein and it is structurally unique, as the best hits from Dali have a Z-score of 3.8 (1nt0, 2j1t, 3kq4) and it is thus a likely candidate for a new fold. 
Interestingly, many of the top Dali hits are involved in sugar metabolism. There are no obvious active site-like cavities on the protein surface of 3m7a (http://www.topsan.org/Proteins/JCSG/).", "487": "PF07261\nReplication initiation and membrane attachment\nThis family consists of several bacterial replication initiation and membrane attachment (DnaB) proteins, as well as DnaD which is a component of the PriA primosome. The PriA primosome functions to recruit the replication fork helicase onto the DNA [PMID:11679082]. The DnaB protein is essential for both replication initiation and membrane attachment of the origin region of the chromosome and plasmid pUB110 in Bacillus subtilis. It is known that there are two different classes (DnaBI and DnaBII) in the DnaB mutants; DnaBI is essential for both chromosome and pUB110 replication, whereas DnaBII is necessary only for chromosome replication [PMID:3027697]. DnaD has been merged into this family. This family also includes Ftn6, a cyanobacterial-specific divisome component possibly playing a role at the interface between DNA replication and cell division [PMID:19698108]. Ftn6 possesses a conserved domain localised within the N-terminus of the proteins. This domain, named FND, exhibits sequence and structure similarities with the DnaD-like domains Pfam:PF04271 now merged into Pfam:PF07261.", "488": "PF01765\nRibosome recycling factor\nThe ribosome recycling factor (RRF / ribosome release factor) dissociates the ribosome from the mRNA after termination of translation, and is essential bacterial growth [PMID:8183897]. Thus ribosomes are \"recycled\" and ready for another round of protein synthesis.", "489": "PF03645\nTctex-1 family\nTctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. 
", "490": "PF03618\nKinase/pyrophosphorylase\nThis family of regulatory proteins has ADP-dependent kinase and inorganic phosphate-dependent pyrophosphorylase activity [1-3].", "491": "PF13682\nChemoreceptor zinc-binding domain\nThe chemoreceptor zinc-binding domain (CZB) is found in bacterial signal transduction proteins - most frequently receptors involved in chemotaxis and motility, but also in c-di-GMP signalling and nitrate/nitrite-sensing. Originally discovered in the cytoplasmic chemoreceptor TlpD from Helicobacter pylori, it is often found C-terminal to the MCPsignal domain in cytoplasmic chemoreceptor proteins. The CZB domain contains a core sequence motif, Hxx[WFYL]x21-28Cx[LFMVI]Gx[WFLVI]x18-27HxxxH. The highly-conserved H-C-H-H residues of this motif are believed to coordinate zinc; mutating the latter two histidines of the motif to alanines abolishes Zn binding. This domain binds zinc with high affinity, with a Kd in the femtomolar range. This domain has been shown in E. coli to be a zinc sensor that regulates the catalytic activity of Pfam:PF00990 [PMID:23769666]. This domain also binds the chemoattractant HOCl at a site very close to that of zinc. It has been shown that zinc participates in HOCl sensing by forming a redox 'Cys-Zn switch' that reacts towards HOCl (Matilla et. al.,FEMS Microbiology Reviews, fuab043, 45, 2021, 1. 
https://doi.org/10.1093/femsre/fuab043).", "492": "PF14416\nPMR5 N terminal Domain\nThe plant family with PMR5, ESK1, TBL3 etc have a N-terminal C rich predicted sugar binding domain followed by the PC-Esterase (acyl esterase) domain [PMID:20056006].", "493": "PF02020\neIF4-gamma/eIF5/eIF2-epsilon\nThis domain of unknown function is found at the C-terminus of several translation initiation factors [PMID:8520487].", "494": "PF10646\nSporulation and spore germination\nThe GerMN domain is a region of approximately 100 residues that is found, duplicated, in the Bacillus GerM protein and is implicated in both sporulation and spore germination. The domain is found in a number of different bacterial species both alone and in association with other domains such as Amidase_3 Pfam:PF01520, Gmad1 and Gmad2. It is predicted to have a novel alpha-beta fold.", "495": "ATP phosphoribosyltransferase", "496": "PF01052\nType III flagellar switch regulator (C-ring) FliN C-term\nThis family includes the C-terminal region of flagellar motor switch proteins FliN and FliM. It is associated with family FliM, Pfam:PF02154 and family FliN_N Pfam:PF16973.", "497": "PF18803\nCxC2 like cysteine cluster associated with KDZ transposases\nA predicted Zinc chelating domain present N-terminal to the KDZ transposase domain [PMID:24398522].", "498": "PF05598\nTransposase domain (DUF772)\nThis presumed domain is found at the N-terminus of many proteins found in transposons.", "499": "Choline/Carnitine o-acyltransferase", "500": "PF02308\nMgtC family\nThe MgtC protein is found in an operon with the Mg2+ transporter protein MgtB. The function of MgtC and its homologues is not known.", "501": "PF04536\nTPM domain\nThis family was first named TPM domain after its founding proteins: TLP18.3, Psb32 and MOLO-1. In Arabidopsis, this domain is called the thylakoid acid phosphatase -TAP - domain and has a Rossmann-like fold [PMID:21908686]. 
In plants, the family resides in the thylakoid lumen attached to the outer membrane of the chloroplast/plastid. It is active in the photosystem II [PMID:17576201, PMID:21653280].", "502": "PF01817\nChorismate mutase type II\nChorismate mutase EC:5.4.99.5 catalyses the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine [PMID:9642265, PMID:9497350].", "503": "PF07479\nNAD-dependent glycerol-3-phosphate dehydrogenase C-terminus\nNAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyses the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the C-terminal substrate-binding domain [PMID:10801498].", "504": "PF06541\nPutative ABC-transporter type IV\nCmpB is a family of membrane proteins that are likely to be part of a two-component type IV ABC-transporter system. Families can transport multiple drugs including ethidium and fluoroquinolones. UniProtKB:Q83XH0 is a member of TCDB family 3.A.1.121.4.", "505": "PF03352\nMethyladenine glycosylase\nThe DNA-3-methyladenine glycosylase I is constitutively expressed and is specific for the alkylated 3-methyladenine DNA. ", "506": "PF01992\nATP synthase (C/AC39) subunit\nThis family includes the AC39 subunit from vacuolar ATP synthase Swiss:P32366 [PMID:8509410], and the C subunit from archaebacterial ATP synthase [PMID:8702544]. The family also includes subunit C from the Sodium transporting ATP synthase from Enterococcus hirae Swiss:P43456 [PMID:8157629].", "507": "PF03481\nThreonylcarbamoyl-AMP synthase, C-terminal domain\nThis domain can be found in the C terminus of threonylcarbamoyl-AMP synthases, including Sua5 from Saccharomyces cerevisiae and YwlC from Bacillus subtilis. 
Threonylcarbamoyl-AMP synthase is required for the formation of a threonylcarbamoyl group on adenosine at position 37 (t6A37) in tRNAs that read codons beginning with adenine [PMID:19287007, PMID:23072323]. This domain adopts the Rossmann fold and may be involved in GTP and/or tRNA binding based on structural similarity with both GTP and tRNA binding proteins [PMID:18004774].", "508": "PF03030\nInorganic H+ pyrophosphatase\nThe H+ pyrophosphatase is an transmembrane proton pump involved in establishing the H+ electrochemical potential difference between the vacuole lumen and the cell cytosol. Vacuolar-type H(+)-translocating inorganic pyrophosphatases have long been considered to be restricted to plants and to a few species of photo-trophic bacteria. However, in recent investigations, these pyrophosphatases have been found in organisms as disparate as thermophilic Archaea and parasitic protists [PMID:11335173].", "509": "Eukaryotic initiation factor 4E", "510": "PF03626\nProkaryotic Cytochrome C oxidase subunit IV\nCytochrome c oxidase (COX) is a multi-subunit enzyme complex that catalyses the final step of electron transfer through the respiratory chain on the mitochondrial inner membrane. This family is composed of cytochrome c oxidase subunit 4 from prokaryotes.", "511": "PF05235\nCHAD domain\nThe CHAD domain is an alpha-helical domain functionally associated with the Pfam:PF01928 domains. It has conserved histidines that may chelate metals [PMID:12456267].", "512": "PF09371\nTex-like protein N-terminal domain\nThis presumed domain is found at the N-terminus of Swiss:Q45388. This protein defines a novel family of prokaryotic transcriptional accessory factors [PMID:8755871].", "513": "PF01311\nBacterial export proteins, family 1\nThis family includes the following members; FliR, MopE, SsaT, YopT, Hrp, HrcT and SpaR All of these members export proteins, that do not possess signal peptides, through the membrane. 
Although the proteins that these exporters move may be different, the exporters are thought to function in similar ways [PMID:7814323].", "514": "Phosphofructokinase", "515": "PF16193\nAAA C-terminal domain\nAAA_assoc_2 is found at the C-terminus of a relatively small set of AAA domains in proteins ranging from archaeal to fungi, plants and mammals.", "516": "PF04087\nDomain of unknown function (DUF389)\nFamily of hypothetical bacterial proteins with an undetermined function.", "517": "PF06050\n2-hydroxyglutaryl-CoA dehydratase, D-component\nDegradation of glutamate via the hydroxyglutarate pathway involves the syn-elimination of water from 2-hydroxyglutaryl-CoA. This anaerobic process is catalysed by 2-hydroxyglutaryl-CoA dehydratase, an enzyme with two components (A and D) that reversibly associate during reaction cycles. This component contains one non-reducible [4Fe-4S]2+ cluster and a reduced riboflavin 5'-monophosphate [PMID:11980491].", "518": "PF09285\nElongation factor P, C-terminal\nMembers of this family of nucleic acid binding domains are predominantly found in elongation factor P, where they adopt an OB-fold, with five beta-strands forming a beta-barrel in a Greek-key topology [PMID:15210970].", "519": "PF02517\nType II CAAX prenyl endopeptidase Rce1-like\nThis family (also known as the ABI (abortive infection) family) contains putative IMPs and has homologues in all three domains of life, including Rce1 from S. cerevisiae [PMID:20154137]. Rce1 is a type II CAAX prenyl protease that processes all farnesylated and geranylgeranylated CAAX proteins. It is an integral membrane endoprotease localized to the endoplasmic reticulum that mediates the cleavage of the carboxyl-terminal three amino acids from CaaX proteins. It is involved in processing the Ras family of small GTPases, the gamma-subunit of heterotrimeric GTPases, nuclear lamins, and protein kinases and phosphatases [PMID:29424242]. Three residues of S. 
cerevisiae Rce1 -E156, H194 and H248- are critical for catalysis [PMID:16361710]. The structure of Rce1 from the archaea Methanococcus (MmRce1) suggests that this group of proteins represents a novel IMP (intramembrane protease) family, the glutamate IMPs [PMID:24291792]. There is a conserved sequence motif EExxxR.", "520": "PF10442\nFIST C domain\nThe FIST C domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids [PMID:17855421].", "521": "PF12878\nSICA extracellular beta domain\nThe SICA (schizont-infected cell agglutination) proteins of P. knowlesi, one of the variant antigen gene families, are associated with parasitic virulence. These proteins are comprised of multiple domains, with the extracellular domains occurring at different frequencies. There can be between 1 and 10 copies of this cysteine-rich domain [PMID:18843368].", "522": "PF00472\nRF-1 domain\nThis domain is found in peptide chain release factors such as RF-1 (Swiss:P07011) and RF-2 (Swiss:P07012), and a number of smaller proteins of unknown function such as Swiss:P40711. This domain contains the peptidyl-tRNA hydrolase activity. The domain contains a highly conserved motif GGQ, where the glutamine is thought to coordinate the water that mediates the hydrolysis.", "523": "PF12704\nMacB-like periplasmic core domain\nThis family represents the periplasmic core domain found in a variety of ABC transporters. The structure of this family has been solved for the MacB protein [PMID:19432486]. 
Some structural similarity was found to the periplasmic domain of the AcrB multidrug efflux transporter.", "524": "PF00104\nLigand-binding domain of nuclear hormone receptor\nThis all helical domain is involved in binding the hormone in these receptors.", "525": "PF10557\nCullin protein neddylation domain\nThis is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue [PMID:15021886].", "526": "PF16188\nC-terminal region of peptidase_M24\nThis is a short region at the C-terminus of a number of metallo-peptidases of the M24 family.", "527": "Sigma-70 factor, region 1.2", "528": "PF12686\nProtein of unknown function (DUF3800)\nThis family of proteins is functionally uncharacterised. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 215 and 302 amino acids in length. 
There is a DE motif at the N-terminus and a QXXD motif at the C-terminus that may be functionally important.", "529": "PF03140\nPlant protein of unknown function\nThe function of the plant proteins constituting this family is unknown.", "530": "PF01450\nAcetohydroxy acid isomeroreductase, catalytic domain\nAcetohydroxy acid isomeroreductase catalyses the conversion of acetohydroxy acids into dihydroxy valerates. This reaction is the second in the synthetic pathway of the essential branched side chain amino acids valine and isoleucine.", "531": "PF08516\nADAM cysteine-rich\nADAMs are membrane-anchored proteases that proteolytically modify cell surface and extracellular matrix (ECM) in order to alter cell behaviour. It has been shown that the cysteine-rich domain of ADAM13 regulates the protein's metalloprotease activity [PMID:12460986].", "532": "PF04366\nLas17-binding protein actin regulator\nYsc84 is a family of Las17-binding proteins found in metazoa. Together, Las17 and Ysc84 are essential for proper polymerisation of actin; Ysc84 is able to bind to and stabilise the actin dimer presented by Las17 and thereby promote polymerisation. An active actin cytoskeleton is necessary for adequate endocytosis. (Pfam:PF00018), or a FYVE zinc finger (Pfam:PF01363).", "533": "PF01820\nD-ala D-ala ligase N-terminus\nThis family represents the N-terminal region of the D-alanine--D-alanine ligase enzyme EC:6.3.2.4 which is thought to be involved in substrate binding [PMID:10908650]. D-Alanine is one of the central molecules of the cross-linking step of peptidoglycan assembly. There are three enzymes involved in the D-alanine branch of peptidoglycan biosynthesis: the pyridoxal phosphate-dependent D-alanine racemase (Alr), the ATP-dependent D-alanine:D-alanine ligase (Ddl), and the ATP-dependent D-alanine:D-alanine-adding enzyme (MurF) [PMID:12499203]. 
This domain is structurally related to the PreATP-grasp domain.", "534": "PF00560\nLeucine Rich Repeat\nCAUTION: This Pfam may not find all Leucine Rich Repeats in a protein. Leucine Rich Repeats are short sequence motifs present in a number of proteins with diverse functions and cellular locations. These repeats are usually involved in protein-protein interactions. Each Leucine Rich Repeat is composed of a beta-alpha unit. These units form elongated non-globular structures. Leucine Rich Repeats are often flanked by cysteine rich domains.", "535": "PF06477\nProtein of unknown function (DUF1091)\nThis is a family of uncharacterised proteins. Based on its distant similarity to Pfam:PF02221 and conserved pattern of cysteine residues it is possible that these domains are also lipid binding.", "536": "PF04092\nSRS domain\nToxoplasma gondii is a persistent protozoan parasite capable of infecting almost any warm-blooded vertebrate. The surface of Toxoplasma is coated with a family of developmentally regulated glycosylphosphatidylinositol (GPI)-linked proteins (SRSs), of which SAG1 is the prototypic member. SRS proteins mediate attachment to host cells and interface with the host immune response to regulate the virulence of the parasite. SAG1 is composed of two disulphide linked SRS domains. These have 6 cysteines that form 1-6,2-5 and 3-4 pairings. The structure of the immunodominant SAG1 antigen reveals a homodimeric configuration [PMID:12091874]. The SRS domain is found in a single copy in the SAG2 proteins. This family of surface antigens are found in other apicomplexans.", "537": "PF10509\nGalactokinase galactose-binding signature\nThis is the highly conserved galactokinase signature sequence which appears to be present in all galactokinases irrespective of how many other ATP binding sites, etc that they carry [PMID:10359639]. 
The function of this domain appears to be to bind galactose [PMID:12796487], and the domain is normally at the N-terminus of the enzymes, EC:2.7.1.6 [PMID:15526155]. This domain is associated with the families GHMP_kinases_C, Pfam:PF08544 and GHMP_kinases_N, Pfam:PF00288.", "538": "PF06968\nBiotin and Thiamin Synthesis associated domain\nBiotin synthase (BioB), EC:2.8.1.6 , catalyses the last step of the biotin biosynthetic pathway. The reaction consists in the introduction of a sulphur atom into dethiobiotin. BioB functions as a homodimer [PMID:12482614]. Thiamin synthesis if a complex process involving at least six gene products (ThiFSGH, ThiI and ThiJ). Two of the proteins required for the biosynthesis of the thiazole moiety of thiamine (vitamin B(1)) are ThiG and ThiH (this family) and form a heterodimer[PMID:12650933]. Both of these reactions are thought of involve the binding of co-factors, and both function as dimers [PMID:12482614, PMID:12650933]. This domain therefore may be involved in co-factor binding or dimerisation (Finn, RD personal observation).", "539": "PF16886\nATPsynthase alpha/beta subunit N-term extension\nATP-synt_ab_Xtn is an extension of the alpha-beta catalytic subunit of VATA or V-type proton ATPase catalytic subunit at the N-terminal end. It is found from bacteria to humans, and was not modelled in family ATP-synt_ab, Pfam:PF00006.", "540": "NADH-ubiquinone/plastoquinone oxidoreductase chain 6", "541": "PF13556\nPucR C-terminal helix-turn-helix domain\nThis helix-turn-helix domain is often found at the C-terminus of PucR-like transcriptional regulators such as Swiss:O32138 and is likely to be DNA-binding.", "542": "PF07907\nYibE/F-like protein\nThe sequences featured in this family are similar to two proteins expressed by Lactococcus lactis, YibE (Swiss:Q9CHC5) and YibF (Swiss:Q9CHC4). 
Most of the members of this family are annotated as being putative membrane proteins, and in fact the sequences contain a high proportion of hydrophobic residues.", "543": "PF03331\nUDP-3-O-acyl N-acetylglycosamine deacetylase\nThe enzymes in this family catalyse the second step in the biosynthetic pathway for lipid A.", "544": "PF02415\nChlamydia polymorphic membrane protein (Chlamydia_PMP) repeat\nThis family contains several Chlamydia polymorphic membrane proteins. Chlamydia pneumoniae is an obligate intracellular bacterium and a common human pathogen causing infection of the upper and lower respiratory tract. Common for the Pmps are the tetrapeptide GGA(I/V/L) motif repeated several times in the N-terminal part. The C-terminal half is characterised by conserved tryptophans and a carboxy-terminal phenylalanine. A signal peptide leader sequence is predicted in 20 C. pneumoniae Pmps, which indicates an outer membrane localisation. Pmp10 and Pmp11 contain a signal peptidase II cleavage site suggesting lipid modification. The C. pneumoniae pmp genes represent 17.5% of the chlamydia-specific coding capacity and they are all transcribed during chlamydial growth but the function of Pmps remains unknown [PMID:11583841]. This family shows some similarity to Pfam:PF05594 and hence is likely to also form a beta-helical structure (personal obs:C Yeats).", "545": "PF04314\nCopper chaperone PCu(A)C\nPCu(A)C is a periplasmic copper chaperone. Its role may be to capture and transfer copper to two other copper chaperones, PrrC and Cox11, which in turn deliver Cu(I) to cytochrome c oxidase [PMID:22248670]. ", "546": "Uncharacterised ACR, YggU family COG1872", "547": "PF01706\nFliG C-terminal domain\nFliG is a component of the flageller rotor, present in about 25 copies per flagellum. 
This domain functions specifically in motor rotation.", "548": "PF14226\nnon-haem dioxygenase in morphine synthesis N-terminal\nThis is the highly conserved N-terminal region of proteins with 2-oxoglutarate/Fe(II)-dependent dioxygenase activity.", "549": "PF02773\nS-adenosylmethionine synthetase, C-terminal domain\nThe three domains of S-adenosylmethionine synthetase have the same alpha+beta fold.", "550": "PF01169\nUncharacterized protein family UPF0016\nThis family contains integral membrane proteins of unknown function. Most members of the family contain two copies of a region that contains an EXGD motif. Each of these regions contains three predicted transmembrane regions. It has been suggested that these proteins are calcium transporters [PMID:24955841].", "551": "PF08711\nTFIIS helical bundle-like domain\nMediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species {1-2]. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function [PMID:18050436]. Mediator exists in two major forms in human cells: a smaller form that interacts strongly with pol II and activates transcription, and a large form that does not interact strongly with pol II and does not directly activate transcription. Notably, the 'small' and 'large' Mediator complexes differ in their subunit composition: the Med26 subunit preferentially associates with the small, active complex, whereas cdk8, cyclin C, Med12 and Med13 associate with the large Mediator complex [PMID:18418385]. 
This family includesthe C terminal region of a number of eukaryotic hypothetical proteins which are homologous to the Saccharomyces cerevisiae protein IWS1. IWS1 is known to be an Pol II transcription elongation factor and interacts with Spt6 and Spt5 [PMID:12242279, PMID:12556496].", "552": "PF04241\nProtein of unknown function (DUF423)\nThis family of proteins with unknown function is a possible integral membrane protein from Caenorhabditis elegans. This family of proteins has GO references indicating the protein is involved in nematode larval development and is a positive regulator of growth rate.", "553": "Thaumatin family", "554": "PF04294\nVanW like protein\nFamily members include vancomycin resistance protein W (VanW). Genes encoding members of this family have been found in vancomycin resistance gene clusters vanB [PMID:11376048] and vanG [PMID:11036060]. The function of VanW is unknown.", "555": "Sec1 family", "556": "ATP synthase subunit C", "557": "PF17137\nDomain of unknown function (DUF5110)\nThis domain is likely to be a carbohydrate-binding domain of some description as it is found immediately C-terminal to the glycosyl-hydrolase family Glyco_hydro_31, Pfam:PF01055.", "558": "HisG, C-terminal domain", "559": "PF02397\nBacterial sugar transferase\nThis Pfam family represents a conserved region from a number of different bacterial sugar transferases, involved in diverse biosynthesis pathways.", "560": "PF02671\nPaired amphipathic helix repeat\nThis family contains the paired amphipathic helix repeat. The family contains the yeast SIN3 gene Swiss:P22579 (also known as SDI1) that is a negative regulator of the yeast HO gene [PMID:2233725]. This repeat may be distantly related to the helix-loop-helix motif, which mediate protein-protein interactions.", "561": "PF04002\nRadC-like JAB domain\nA family of proteins present widely across the bacteria. This family was named initially with reference to the E. 
coli radC102 mutation which suggested that RadC was involved in repair of DNA lesions [PMID:10224240]. However the relevant mutation has subsequently been shown to be in recG, where radC is in fact an allele of recG [PMID:11053371]. In addition, a personal communication from Claverys, J-P, et al, indicates a total failure of all attempts to characterise a radiation-related function for RadC in Streptococcus pneumoniae, suggesting that it is not involved in repair of DNA lesions, in recombination during transformation, in gene conversion, nor in mismatch repair. Computational analysis, however, provides a possible function. The RadC-like family belong to the JAB superfamily of metalloproteins [PMID:21890906]. The domain shows fusions to an N-terminal Helix-hairpin-Helix (HhH) domain in most instances. Other domain combinations include fusions to the anti-restriction module ArdC, the DinG/RAD3-like superfamily II helicases and the DNAG-like primase. In some bacteria, closely related DinG/Rad3- like superfamily II helicases are fused to a 3'-5' exonuclease in the same position as the RadC-like JAB domain. These conserved domain associations lead to the hypothesis that the RadC-like JAB domains might function as a nuclease [PMID:21890906].", "562": "PF01384\nPhosphate transporter family\nThis family includes PHO-4 from Neurospora crassa which is a is a Na(+)-phosphate symporter [PMID:7732001]. This family also contains the leukaemia virus receptor Swiss:Q08344.", "563": "PF05402\nCoenzyme PQQ synthesis protein D (PqqD)\nThis family contains several bacterial coenzyme PQQ synthesis protein D (PqqD) sequences. 
This protein is required for coenzyme pyrrolo-quinoline-quinone (PQQ) biosynthesis [PMID:8002620, PMID:12437981].", "564": "PF13012\nMaintenance of mitochondrial structure and function\nThis is C-terminal to the Mov24 region of the yeast proteasomal subunit Rpn11 and seems likely to regulate the mitochondrial fission and tubulation processes, ie the outer mitochondrial membrane proteins. This function appears to be unrelated to the proteasome activity of the N-terminal region [PMID:18172023].", "565": "PF01888\nCbiD\nCbiD is essential for cobalamin biosynthesis in both S. typhimurium and B. megaterium, no functional role has been ascribed to the protein. The CbiD protein has a putative S-AdoMet binding site. It is possible that CbiD might have the same role as CobF in undertaking the C-1 methylation and deacylation reactions required during the ring contraction process [PMID:9742225].", "566": "PF01055\nGlycosyl hydrolases family 31\nGlycosyl hydrolases are key enzymes of carbohydrate metabolism. Family 31 comprises of enzymes that are, or similar to, alpha- galactosidases.", "567": "PF03150\nDi-haem cytochrome c peroxidase\nThis is a family of distinct cytochrome c peroxidases (CCPs) that contain two haem groups. Similar to other cytochrome c peroxidases, they reduce hydrogen peroxide to water using c-type haem as an oxidisable substrate. However, since they possess two, instead of one, haem prosthetic groups, bacterial CCPs reduce hydrogen peroxide without the need to generate semi-stable free radicals. The two haem groups have significantly different redox potentials. The high potential (+320 mV) haem feeds electrons from electron shuttle proteins to the low potential (-330 mV) haem, where peroxide is reduced (indeed, the low potential site is known as the peroxidatic site) [PMID:8591033]. The CCP protein itself is structured into two domains, each containing one c-type haem group, with a calcium-binding site at the domain interface. 
This family also includes MauG proteins, whose similarity to di-haem CCP was previously recognised [PMID:9202457].", "568": "PF04127\nDNA / pantothenate metabolism flavoprotein\nThe DNA/pantothenate metabolism flavoprotein (EC:4.1.1.36) affects synthesis of DNA, and pantothenate metabolism.", "569": "PF07264\nEtoposide-induced protein 2.4 (EI24)\nThis family contains a number of eukaryotic etoposide-induced 2.4 (EI24) proteins approximately 350 residues long as well as bacterial CysZ proteins (formerly known as DUF540). In cells treated with the cytotoxic drug etoposide, EI24 is induced by p53 [PMID:8649819]. It has been suggested to play an important role in negative cell growth control [PMID:10594026].", "570": "PF00177\nRibosomal protein S7p/S5e\nThis family contains ribosomal protein S7 from prokaryotes and S5 from eukaryotes.", "571": "PF16325\nPeptidase family U32 C-terminal domain\nThis domain is found at the C-terminus of many members of Peptidase family U32 (Pfam:PF01136).", "572": "PF17827\nPrmC N-terminal domain\nThis entry corresponds to the N-terminal alpha helical domain of the HemK protein. HemK is a methyltransferase enzyme that carries out the methylation of the N5 nitrogen of the glutamine found in the conserved GGQ motif of class-1 release factors [PMID:12741815].", "573": "PF13490\nPutative zinc-finger\nThis is a putative zinc-finger found in some anti-sigma factor proteins.", "574": "Ribosomal protein L19", "575": "PF12697\nAlpha/beta hydrolase family\nThis family contains alpha/beta hydrolase enzymes of diverse specificity.", "576": "PF18199\nDynein heavy chain C-terminal domain\nThis family represents the C-terminal domain of dynein heavy chain. This domain is a complex structure comprising six alpha-helices and an incomplete six-stranded antiparallel beta-barrel. 
The shape of this domain is distinctively flat, spreading over the AAA1, AAA5 and AAA6 domain [PMID:22398446].", "577": "PF07593\nASPIC and UnbV\nThis conserved sequence is found associated with Pfam:PF00515 in several paralogous proteins in Rhodopirellula baltica. It is also found associated with Pfam:PF01839 in several eukaryotic integrin-like proteins (e.g. human ASPIC Swiss:Q9NQ78) and in several other bacterial proteins (e.g. Swiss:Q84HN1 [PMID:12536216]).", "578": "PF17768\nRecJ OB domain\nThis OB-fold is found in RecJ proteins where is binds to ssDNA [PMID:27058167].", "579": "PF02021\nUncharacterised protein family UPF0102\nThe function of this family is unknown.", "580": "PF14497\nGlutathione S-transferase, C-terminal domain\nThis domain is closely related to Pfam:PF00043.", "581": "PF02575\nYbaB/EbfC DNA-binding family\nThis is a family of DNA-binding proteins. Members of this family form homodimers which bind DNA via a tweezer-like structure [1-3]. The conformation of the DNA is changed when bound to these proteins [PMID:19208644]. In bacteria, these proteins may play a role in DNA replication-recovery following DNA damage [PMID:12486730].", "582": "PF00646\nF-box domain\nThis domain is approximately 50 amino acids long, and is usually found in the N-terminal half of a variety of proteins. Two motifs that are commonly found associated with the F-box domain are the leucine rich repeats (LRRs; Pfam:PF00560 and Pfam:PF07723) and the WD repeat (Pfam:PF00400). The F-box domain has a role in mediating protein-protein interactions in a variety of contexts, such as polyubiquitination, transcription elongation, centromere binding and translational repression [1-2].", "583": "PF03561\nAllantoicase repeat\nThis family is found in pairs in Allantoicases, forming the majority of the protein. 
These proteins allow the use of purines as secondary nitrogen sources in nitrogen-limiting conditions through the reaction: allantoate + H(2)0 = (-)-ureidoglycolate + urea.", "584": "PF10590\nPyridoxine 5'-phosphate oxidase C-terminal dimerisation region\nThis domain represents one of the two dimerisation regions of the protein, located at the edge of the dimer interface, at the C-terminus, being the last three beta strands, S6, S7, and S8 along with the last three residues to the end. In Swiss:P21159, S6 runs from residues 178-192, S7 from 200-206 and S8 from 211-215. the extended loop, of residues 167-177 may well be involved in the pocket formed between the two dimers that positions the FMN molecule [PMID:10903950].To date, the only time functional oxidase or phenazine biosynthesis activities have been experimentally demonstrated is when the sequences contain both Pfam:PF01243 and Pfam:PF10590. It is unknown the role performed by each domain in bringing about molecular functions of either oxidase or phenazine activity [PMID:26327315].", "585": "PF05504\nSpore germination B3/ GerAC like, C-terminal\nThe GerAC protein of the Bacillus subtilis spore is required for the germination response to L-alanine. Members of this family are thought to be located in the inner spore membrane. Although the function of this family is unclear, they are likely to encode the components of the germination apparatus that respond directly to this germinant, mediating the spore's response [PMID:11418573].", "586": "PF12396\nProtein of unknown function (DUF3659)\nThis domain family is found in bacteria and eukaryotes, and is approximately 70 amino acids in length.", "587": "PF01725\nHam1 family\nThis family consists of the HAM1 protein Swiss:P47119 and hypothetical archaeal bacterial and C. elegans proteins. HAM1 controls 6-N-hydroxylaminopurine (HAP) sensitivity and mutagenesis in S. cerevisiae Swiss:P47119 [PMID:8789257]. 
The HAM1 protein protects the cell from HAP, either on the level of deoxynucleoside triphosphate or the DNA level by a yet unidentified set of reactions [PMID:8789257].", "588": "PF01502\nPhosphoribosyl-AMP cyclohydrolase\nThis enzyme catalyses the third step in the histidine biosynthetic pathway. It requires Zn ions for activity.", "589": "HupF/HypC family", "590": "PF00576\nHIUase/Transthyretin family\nThis family includes transthyretin that is a thyroid hormone-binding protein that transports thyroxine from the bloodstream to the brain. However, most of the sequences listed in this family do not bind thyroid hormones. They are actually enzymes of the purine catabolism that catalyse the conversion of 5-hydroxyisourate (HIU) to OHCU [PMID:16462750, PMID:16098976]. HIU hydrolysis is the original function of the family and is conserved from bacteria to mammals; transthyretins arose by gene duplications in the vertebrate lineage [PMID:16952372]. HIUases are distinguished in the alignment from the conserved C-terminal YRGS sequence.", "591": "PF13493\nDomain of unknown function (DUF4118)\nThis domain is found in a wide variety of bacterial signalling proteins. It is likely to be a transmembrane domain involved in ligand sensing.", "592": "PF12781\nATP-binding dynein motor region\nThis domain is found in human cytoplasmic dynein-2 proteins. Cytoplasmic dynein-2 (dynein-2) performs intraflagellar transport and is associated with human skeletal ciliopathies. Dyneins share a conserved motor domain that couples cycles of ATP hydrolysis with conformational changes to produce movement. Structural analysis reveal that the motor's ring consists of six AAA+ domains (ATPases associated with various cellular activities (AAA1-AAA6). This is the fifth AAA+ domain subdomain AAA5S. Structural analysis reveal that it is the coiled-coil buttress interface. The relative movement of AAA5S together with the stalk (AAA4S), is coupled to rearrangements in the AAA+ ring. 
Closure of the AAA1 site and the rigid body movement of AAA2-AAA4 force the AAA4/AAA5 interface to close and the AAA6L subdomain to rotate towards the ring centre. The AAA5S subdomain rotates as a unit together with AAA6L, and this movement pulls the buttress relative to the stalk [PMID:25470043].", "593": "PF13089\nPolyphosphate kinase N-terminal domain\nPolyphosphate kinase (Ppk) catalyses the formation of polyphosphate from ATP, with chain lengths of up to a thousand or more orthophosphate molecules.", "594": "PF01155\nHydrogenase/urease nickel incorporation, metallochaperone, hypA\nHypA is a metallochaperone that binds nickel to bring it safely to its target. The targets for Hypa are the nickel-containing enzymes [Ni,Fe]-hydrogenase and urease. The nickel coordinates with four nitrogens within the protein. The four conserved cysteines towards the C-terminus bind one zinc moiety probably to stabilise the protein fold [PMID:19621959].", "595": "PF04264\nYceI-like domain\nE. coli YceI is a base-induced periplasmic protein [PMID:12107143]. The recent structure of a member of this family shows that it binds to poly-isoprenoid [PMID:15741337]. The structure consists of an extended, eight-stranded, antiparallel beta-barrel that resembles the lipocalin fold.", "596": "PF16320\nRibosomal protein L7/L12 dimerisation domain\nThis is the N-terminal dimerisation domain of ribosomal protein L7/L12 [PMID:10637222].", "597": "PF12680\nSnoaL-like domain\nThis family contains a large number of proteins that share the SnoaL fold.", "598": "PF07963\nProkaryotic N-terminal methylation motif\nThis short motif directs methylation of the conserved phenylalanine residue. 
It is most often found at the N-terminus of pilins and other proteins involved in secretion, see Pfam:PF00114, Pfam:PF05946, Pfam:PF02501 and Pfam:PF07596.", "599": "PF00271\nHelicase conserved C-terminal domain\nThe Prosite family is restricted to DEAD/H helicases, whereas this domain family is found in a wide variety of helicases and helicase related proteins. It may be that this is not an autonomously folding unit, but an integral part of the helicase.", "600": "PF10825\nProtein of unknown function (DUF2752)\nThis family is conserved in bacteria. Many members are annotated as being putative membrane proteins.", "601": "ATP synthase A chain", "602": "Protein kinase C terminal domain", "603": "PF03473\nMOSC domain\nThe MOSC (MOCO sulfurase C-terminal) domain is a superfamily of beta-strand-rich domains identified in the molybdenum cofactor sulfurase and several other proteins from both prokaryotes and eukaryotes. These MOSC domains contain an absolutely conserved cysteine and occur either as stand-alone forms such as Swiss:P32157, or fused to other domains such as NifS-like catalytic domain in Molybdenum cofactor sulfurase. The MOSC domain is predicted to be a sulfur-carrier domain that receives sulfur abstracted by the pyridoxal phosphate-dependent NifS-like enzymes, on its conserved cysteine, and delivers it for the formation of diverse sulfur-metal clusters.", "604": "Thiamine pyrophosphate enzyme, C-terminal TPP binding domain", "605": "PF03746\nLamB/YcsF family\nThis family includes LamB. The lam locus of Aspergillus nidulans consists of two divergently transcribed genes, lamA and lamB, involved in the utilisation of lactams such as 2-pyrrolidinone. Both genes are under the control of the positive regulatory gene amdR and are subject to carbon and nitrogen metabolite repression [PMID:1729609]. 
The exact molecular function of the proteins in this family is unknown.", "606": "PF02446\n4-alpha-glucanotransferase\nThese enzymes EC:2.4.1.25 transfer a segment of a (1,4)-alpha-D-glucan to a new 4-position in an acceptor, which may be glucose or (1,4)-alpha-D-glucan [PMID:7678257].", "607": "PF01313\nBacterial export proteins, family 3\nThis family includes the following members; FliQ, MopD, HrcS, Hrp, YopS and SpaQ All of these members export proteins, that do not possess signal peptides, through the membrane. Although the proteins that these exporters move may be different, the exporters are thought to function in similar ways [PMID:7814323].", "608": "PF03861\nANTAR domain\nANTAR (AmiR and NasR transcription antitermination regulators) is an RNA-binding domain found in bacterial transcription antitermination regulatory proteins. The majority of the domain consists of a coiled-coil. ", "609": "PF02934\nGatB/GatE catalytic domain\nThis domain is found in the GatB and GatE proteins [PMID:16216574].", "610": "TRCF domain", "611": "PF01330\nRuvA N terminal domain\nThe N terminal domain of RuvA has an OB-fold structure. This domain forms the RuvA tetramer contacts [PMID:8832889].", "612": "Ribosome-binding factor A", "613": "Fibronectin type II domain", "614": "PF04299\nPutative FMN-binding domain\nIn Bacillus subtilis, family member Swiss:P21341 (PAI 2/ORF-2) was found to be essential for growth [PMID:2108124]. The SUPERFAMILY database finds that this domain is related to FMN-binding domains, suggesting this protein is also FMN-binding.", "615": "Ribosomal protein L11, RNA binding domain", "616": "PF02669\nK+-transporting ATPase, c chain\nThis family consists of K+-transporting ATPase, c chain, KdpC. KdpC forms strong interactions with the KdpA subunit, serving to assemble and stabilise the Kdp complex [PMID:9858692]. 
It has been suggested that KdpC could be one of the connecting links between the energy providing subunit KdpB and the K+-transporting subunit KdpA [PMID:9858692]. The K+ transport system actively transports K+ ions via ATP hydrolysis.", "617": "PF01924\nHydrogenase formation hypA family\nHypD is involved in hydrogenase formation. It contains many possible metal binding residues, which may bind to nickel. Transposon Tn5 insertions into hypD resulted in R. leguminosarum mutants that lacked any hydrogenase activity in symbiosis with peas [PMID:8326860].", "618": "PF07549\nSecD/SecF GG Motif\nThis family consists of various prokaryotic SecD and SecF protein export membrane proteins. This SecD and SecF proteins are part of the multimeric protein export complex comprising SecA, D, E, F, G, Y, and YajC [PMID:9694879]. SecD and SecF are required to maintain a proton motive force [PMID:8112309]. This alignment encompasses a -GG- motif typically found in N-terminal half of the SecD/SecF proteins .", "619": "PF04051\nTransport protein particle (TRAPP) component\nTRAPP plays a key role in the targeting and/or fusion of ER-to-Golgi transport vesicles with their acceptor compartment. TRAPP is a large multimeric protein that contains at least 10 subunits. This family contains many TRAPP family proteins. The Bet3 subunit is one of the better characterised TRAPP proteins and has a dimeric structure [PMID:15608655] with hydrophobic channels. The channel entrances are located on a putative membrane-interacting surface that is distinctively flat, wide and decorated with positively charged residues. Bet3 is proposed to localise TRAPP to the Golgi [PMID:15608655].", "620": "PF17941\nPolyphosphate kinase C-terminal domain 1\nPolyphosphate kinase (Ppk) catalyses the formation of polyphosphate from ATP, with chain lengths of up to a thousand or more orthophosphate molecules. This C1-terminal domain has a structure similar to phospholipase D. 
It is one of two closely related carboxy-terminal domains (C1 and C2 domains). Both the C1 and C2 domains (residues 322-502 and 503-687, respectively) consist of a sevenstranded mixed beta-sheet flanked by five alpha-helices. However, the structural topology and relative orientations of the helices to the beta-sheet in these two domains are different. The C1 and C2 domains are highly conserved in the PPK family. Some of the residues previously shown to be crucial for the enzyme catalytic activity are located in these two domains [PMID:15947782].", "621": "PF05635\n23S rRNA-intervening sequence protein\nThis family consists of bacterial proteins encoded within an intervening sequence present within some 23S rRNA genes [1-3]. It folds into an anti-parallel four-helix bundle and forms homopentamers [PMID:16948161].", "622": "PF02595\nGlycerate kinase family\nThis is family of Glycerate kinases.", "623": "PF03462\nPCRF domain\nThis domain is found in peptide chain release factors.", "624": "PF12661\nHuman growth factor-like EGF\nhEGF, or human growth factor-like EGF, domains have six conserved residues disulfide-bonded into the characteristic 'ababcc' pattern. They are involved in growth and proliferation of cells, in proteins of the Notch/Delta pathway, neurogulin and selectins. hEGFs are also found in mosaic proteins with four-disulfide laminin EGFs such as aggrecan and perlecan. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal Cys residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In hEGFs the C-terminal thiol resides in the beta-turn, resulting in shorter loop-lengths between the Cys residues of disulfide 'c', typically C[8-9]XC. These shorter loop-lengths are also typical of the four-disulfide EGF domains, laminin ad integrin. 
Tandem hEGF domains have six linking residues between terminal cysteines of adjacent domains. hEGF domains may or may not bind calcium in the linker region. hEGF domains with the consensus motif CXD4X[F,Y]XCXC are hydroxylated exclusively in the Asp residue.", "625": "PF01590\nGAF domain\nThis domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyse ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalysed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyses the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54. This domain can bind biliverdine and phycocyanobilin (Matilla et al., FEMS Microbiology Reviews, fuab043, 45, 2021, 1. https://doi.org/10.1093/femsre/fuab043).", "626": "PF04463\n2-thiouracil desulfurase\nThis family of proteins, predominantly found in Bacteria, are involved in the desulfuration of 2-thiouracil into uracil in the 2-thiouridine degradation pathway. 
It has been demonstrated that these proteins contain a Fe-S cluster required for their activity [PMID:29194984].", "627": "PF02677\nEpoxyqueuosine reductase QueH\nThe reduction of epoxyqueuosine (oQ) is the last step in the synthesis of the tRNA modification queuosine (Q). members of this family were predicted to encode for an alternative epoxyqueuosine reductase. Furthermore, it has been suggested that family members are a non-orthologous replacement of queG, responsible for oQ to Q conversion. QueH contains conserved cysteines that could be involved in the coordination of a Fe/S center in a similar fashion to what has been identified in QueG. No cobalamin was identified associated with recombinant QueH protein, indicating that the reduction activity is independent from cobalamin [PMID:28128549].", "628": "PF10415\nFumarase C C-terminus\nFumarase C catalyses the stereo-specific interconversion of fumarate to L-malate as part of the Kreb's cycle. The full-length protein forms a tetramer with visible globular shape. FumaraseC_C is the C-terminal 65 residues referred to as domain 3. The core of the molecule consists of a bundle of 20 alpha-helices from the five-helix bundle of domain 2. The projections from the core of the tetramer are generated from domains 1 and 3 of each subunit [PMID:8909293]. FumaraseC_C does not appear to be part of either the active site or the activation site but is helical in structure forming a little bundle.", "629": "PF07719\nTetratricopeptide repeat\nThis Pfam entry includes outlying Tetratricopeptide-like repeats (TPR) that are not matched by Pfam:PF00515.", "630": "Porphobilinogen deaminase, dipyromethane cofactor binding domain", "631": "PF14237\nGYF domain 2\nThis domain is found in bacteria, archaea and eukaryotes, and is approximately 50 amino acids in length. It contains an evolutionary conserved signature W-X-Y-X6-11-GPF-X4-M-X2-W-X3-GYF, the site of interaction with proline-rich peptides. 
Family members include RME-8 (Required for receptor-mediated endocytosis 8), a DNAJC13 protein. RME-8 was first identified as a protein that is required for endocytosis in Caenorhabditis elegans. It coordinates the activity of the WASH complex with the function of the retromer SNX dimer to control endosomal tubulation [PMID:24643499]. Family members found in Arabidopsis include Arabidopsis trithorax-related3 (Atxr3), also known as set domain group 2 (Sdg2). It is the major enzyme responsible for H3K4me3 in Arabidopsis and SDG2-dependent H3K4m3 is critical for regulating gene expression and plant development [PMID:20937886]. Another family member found in Arabidopsis is Tic56. It is an essential subunit of a 1-MDa protein complex at the inner chloroplast envelope membrane [PMID:28125316]. Furthermore, Tic56 is important for rRNA processing and chloroplast ribosome assembly [PMID:27733515].", "632": "PF13368\nTopoisomerase C-terminal repeat\nThis domain is repeated up to five times to form the C-terminal region of bacterial topoisomerase immediately downstream of the zinc-finger motif.", "633": "PF01546\nPeptidase family M20/M25/M40\nThis family includes a range of zinc metallopeptidases belonging to several families in the peptidase classification [PMID:7674922]. Family M20 are Glutamate carboxypeptidases. Peptidase family M25 contains X-His dipeptidases.", "634": "PF09754\nPAC2 family\nThis PAC2 (Proteasome assembly chaperone) family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 247 and 307 amino acids in length. These proteins function as a chaperone for the 26S proteasome. The 26S proteasome mediates ubiquitin-dependent proteolysis in eukaryotic cells. A number of studies including very recent ones have revealed that assembly of its 20S catalytic core particle is an ordered process that involves several conserved proteasome assembly chaperones (PACs). 
Two heterodimeric chaperones, PAC1-PAC2 and PAC3-PAC4, promote the assembly of rings composed of seven alpha subunits [PMID:18786393].", "635": "PF03120\nNAD-dependent DNA ligase OB-fold domain\nDNA ligases catalyse the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor [PMID:10698952]. This family is a small domain found after the adenylation domain Pfam:PF01653 in NAD dependent ligases [PMID:10698952]. OB-fold domains generally are involved in nucleic acid binding. ", "636": "EF-hand domain", "637": "Respiratory-chain NADH dehydrogenase, 30 Kd subunit", "638": "PF02603\nHPr Serine kinase N terminus\nThis family represents the N-terminal region of Hpr Serine/threonine kinase PtsK. This kinase is the sensor in a multicomponent phospho-relay system in control of carbon catabolic repression in bacteria [PMID:9570401]. This kinase in unusual in that it recognises the tertiary structure of its target and is a member of a novel family unrelated to any previously described protein phosphorylating enzymes [PMID:9570401]. X-ray analysis of the full-length crystalline enzyme from Staphylococcus xylosus at a resolution of 1.95 A shows the enzyme to consist of two clearly separated domains that are assembled in a hexameric structure resembling a three-bladed propeller. The blades are formed by two N-terminal domains each, and the compact central hub assembles the C-terminal kinase domains [PMID:11904409].", "639": "PF14842\nFliG N-terminal domain\nThis is the N-terminal domain of the flagellar rotor protein FliG [PMID:20676082].", "640": "PF01424\nR3H domain\nThe name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. 
The function of the domain is predicted to be binding ssDNA.", "641": "Elongation factor P (EF-P) OB domain", "642": "PF13399\nLytR cell envelope-related transcriptional attenuator\nThis family appears at the C-terminus of members of the LytR_cpsA_psr, Pfam:PF03816, family", "643": "PF01977\n3-octaprenyl-4-hydroxybenzoate carboxy-lyase\nThis family has been characterised as 3-octaprenyl-4- hydroxybenzoate carboxy-lyase enzymes [PMID:782527]. This enzyme catalyses the third reaction in ubiquinone biosynthesis. For optimal activity the carboxy-lase was shown to require Mn2+ [PMID:782527].", "644": "PF00438\nS-adenosylmethionine synthetase, N-terminal domain\nThe three domains of S-adenosylmethionine synthetase have the same alpha+beta fold.", "645": "PF02542\nYgbB family\nThe ygbB protein is a putative enzyme of deoxy-xylulose pathway (terpenoid biosynthesis) [PMID:10694574].", "646": "PF03367\nZPR1 zinc-finger domain\nThe zinc-finger protein ZPR1 is ubiquitous among eukaryotes. It is indeed known to be an essential protein in yeast. In quiescent cells, ZPR1 is localised to the cytoplasm. But in proliferating cells treated with EGF or with other mitogens, ZPR1 accumulates in the nucleolus. ZPR1 interacts with the cytoplasmic domain of the inactive EGF receptor (EGFR) and is thought to inhibit the basal protein tyrosine kinase activity of EGFR. This interaction is disrupted when cells are treated with EGF, though by themselves, inactive EGFRs are not sufficient to sequester ZPR1 to the cytoplasm [PMID:9852145, PMID:9763455, PMID:8650580]. Upon stimulation by EGF, ZPR1 directly binds the eukaryotic translation elongation factor-1alpha (eEF-1alpha) to form ZPR1/eEF-1alpha complexes [PMID:9852145]. These move into the nucleus, localising particularly at the nucleolus. 
Indeed, the interaction between ZPR1 and eEF-1alpha has been shown to be essential for normal cellular proliferation [PMID:9852145], and ZPR1 is thought to be involved in pre-ribosomal RNA expression [PMID:9763455]. The ZPR1 domain consists of an elongation initiation factor 2-like zinc finger and a double-stranded beta helix with a helical hairpin insertion. ZPR1 binds preferentially to GDP-bound eEF1A but does not directly influence the kinetics of nucleotide exchange or GTP hydrolysis [PMID:17704259]. The alignment for this family shows a domain of which there are two copies in ZPR1 proteins. This family also includes several hypothetical archaeal proteins (from both Crenarchaeota and Euryarchaeota), which only contain one copy of the aligned region. This similarity between ZPR1 and archaeal proteins was not previously noted.", "647": "PF03381\nLEM3 (ligand-effect modulator 3) family / CDC50 family\nMembers of this family have been predicted to contain transmembrane helices. The family member LEM3 (Swiss:P42838) is a ligand-effect modulator, mutation of which increases glucocorticoid receptor activity in response to dexamethasone and also confers increased activity on other intracellular receptors including the progesterone, oestrogen and mineralocorticoid receptors. LEM3 is thought to affect a downstream step in the glucocorticoid receptor pathway. Factors that modulate ligand responsiveness are likely to contribute to the context-specific actions of the glucocorticoid receptor in mammalian cells [PMID:11063677]. The products of genes YNR048w (Swiss:P53740), YNL323w (Swiss:P42838) and YCR094w (Swiss:P25656) (CDC50) show redundancy of function and are involved in regulation of transcription via CDC39 [PMID:11180453]. CDC39 (also known as NOT1) is normally a negative regulator of transcription either by affecting the general RNA polymerase II machinery or by altering chromatin structure [PMID:8428577]. 
One function of CDC39 is to block activation of the mating response pathway in the absence of pheromone, and mutation causes arrest in G1 by activation of the pathway [PMID:2099190]. It may be that the cold-sensitive arrest in G1 noticed in CDC50 mutants [PMID:11180453] may be due to inactivation of CDC39. The effects of LEM3 on glucocorticoid receptor activity may also be due to effects on transcription via CDC39.", "648": "PF12002\nMgsA AAA+ ATPase C terminal\nThe MgsA protein possesses DNA-dependent ATPase and ssDNA annealing activities [PMID:15743409]. MgsA contributes to the recovery of stalled replication forks and therefore prevents genomic instability caused by aberrant DNA replication [PMID:15743409]. Additionally, MgsA may play a role in chromosomal segregation [PMID:15743409]. This is consistent with a report that MgsA co-localises with the replisome and affects chromosome segregation [PMID:15743409]. This domain represents the C terminal region of MgsA.", "649": "PF05336\nL-rhamnose mutarotase\nThis family contains L-rhamnose mutarotase which is a glycosyl hydrolase that converts the monosaccharide L-rhamnopyranose from the alpha to the beta stereoisomer. In Escherichia coli this enzyme is the product of the rhaM gene (also known as yiiL). The tertiary structure has been solved, in complex with L-rhamnose, and the catalytic mechanism determined. His22 is the proton donor. The enzyme naturally exists as a dimer.", "650": "PF02626\nCarboxyltransferase domain, subdomain A and B\nUrea carboxylase (UC) catalyses a two-step, ATP- and biotin-dependent carboxylation reaction of urea. It is composed of biotin carboxylase (BC), carboxyltransferase (CT), and biotin carboxyl carrier protein (BCCP) domains. The CT domain of UC consists of four subdomains, named A, B, C and D. This domain covers the A and B subdomains of the CT domain. This domain covers the whole length of KipA (kinase A) from Bacillus subtilis [PMID:9334321]. It can also be found in S. 
cerevisiae urea amidolyase Dur1,2, which is a multifunctional biotin-dependent enzyme with domains for urea carboxylase and allophanate (urea carboxylate) hydrolase activity[PMID:20884691]. ", "651": "PF16355\nDomain of unknown function (DUF4982)\nThis family is found in the C-terminal of uncharacterized proteins and beta-galactosidases around 680 residues in length from various Bacteroides species. The function of this protein is unknown.", "652": "PF03595\nVoltage-dependent anion channel\nThis family of transporters has ten alpha helical transmembrane segments [PMID:20981093]. The structure of a bacterial homologue of SLAC1 shows it to have a trimeric arrangement. The pore is composed of five helices with a conserved Phe residue involved in gating. One homologue, Mae1 from the yeast Schizosaccharomyces pombe, functions as a malate uptake transporter; another, Ssu1 from Saccharomyces cerevisiae and other fungi including Aspergillus fumigatus, is characterised as a sulfite efflux pump; and TehA from Escherichia coli is identified as a tellurite resistance protein by virtue of its association in the tehA/tehB operon. In plants, this family is found in the stomatal guard cells functioning as an anion-transporting pore [PMID:18305484]. Many homologues are incorrectly annotated as tellurite resistance or dicarboxylate transporter (TDT) proteins.", "653": "Ribosomal protein L23", "654": "Class II release factor RF3, C-terminal domain", "655": "PF02820\nmbt repeat\nThe function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein Swiss:Q9VHA0. The repeat is found in up to four copies as in Swiss:Q9UHJ3. 
The repeat contains a completely conserved glutamate at its amino terminus that may be important for function.", "656": "Imidazoleglycerol-phosphate dehydratase", "657": "PF00003\n7 transmembrane sweet-taste receptor of 3 GCPR\nThis is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G Pfam:PF07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor Pfam:PF01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness [PMID:16076846].", "658": "PEP-utilising enzyme, N-terminal", "659": "PF08495\nFIST N domain\nThe FIST N domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids [PMID:17855421].", "660": "PF11987\nTranslation-initiation factor 2\nIF-2 is a translation initiator in each of the three main phylogenetic domains (Eukaryotes [PMID:17086204], Bacteria [PMID:10878130] and Archaea [PMID:16169924]). IF2 interacts with formylmethionine-tRNA, GTP, IF1, IF3 and both ribosomal subunits [PMID:10878130]. 
Through these interactions, IF2 promotes the binding of the initiator tRNA to the A site in the smaller ribosomal subunit and catalyses the hydrolysis of GTP following initiation-complex formation [PMID:10878130].", "661": "PF09347\nDomain of unknown function (DUF1989)\nThis family of proteins are functionally uncharacterised.", "662": "PF04168\nA predicted alpha-helical domain with a conserved ER motif.\nAn uncharacterized alpha helical domain containing a highly conserved ER motif and typically found as a tandem duplication. Contextual analysis suggests that it functions in a distinct peptide synthesis/modification system comprising of a transglutaminase, a peptidase of the NTN-hydrolase superfamily, an active and inactive circularly permuted ATP-grasp domains and a transglutaminase fused N-terminal to a circularly permuted COOH-NH2 ligase domain [PMID:20023723].", "663": "TNFR/NGFR cysteine-rich region", "664": "PF02508\nRnf-Nqr subunit, membrane protein\nThis is a family of integral membrane proteins including Rhodobacter-specific nitrogen fixation (rnf) proteins RnfA and RnfE [PMID:9492268] and Na+-translocating NADH:ubiquinone oxidoreductase (Na+-NQR) subunits NqrD and NqrE. ", "665": "PF17146\nPIN domain of ribonuclease\nThis is a PIN domain found in eukaryotic ribonuclease Nob1 and archaeal ribonuclease VapC1 [PMID:22156373]. Budding yeast Nob1 is involved in proteasomal and 40S ribosomal subunit biogenesis [PMID:10675611]. VapC1 is a toxic component and a ribonuclease of a toxin-antitoxin (TA) module [PMID:25391136]. PIN domains are small protein domains identified by the presence of three strictly conserved acidic residues. Apart from these three residues, there is poor sequence conservation [PMID:21036780]. PIN domains are found in eukaryotes, eubacteria and archaea. In eukaryotes they are ribonucleases involved in nonsense mediated mRNA decay [PMID:17053788] and in processing of 18S ribosomal RNA [PMID:19706509]. 
In prokaryotes, they are the toxic components of toxin-antitoxin (TA) systems, their toxicity arising by virtue of their ribonuclease activity. The PIN domain TA systems are now called VapBC TAs(virulence associated proteins), where VapB is the inhibitor and VapC, the PIN-domain ribonuclease toxin [PMID:21036780].", "666": "PF02578\nMulti-copper polyphenol oxidoreductase laccase\nLaccases are multi-copper oxidoreductases able to oxidise a wide variety of phenolic and non-phenolic compounds and are widely distributed among both prokaryotes and eukaryotes. There are two main active catalytic sites with conserved histidines that are capable of binding four copper atoms [PMID:16740638].", "667": "PF00828\nRibosomal proteins 50S-L15, 50S-L18e, 60S-L27A\nThis family includes higher eukaryotic ribosomal 60S L27A, archaeal 50S L18e, prokaryotic 50S L15, fungal mitochondrial L10, plant L27A, mitochondrial L15 and chloroplast L18-3 proteins.", "668": "PF02016\nLD-carboxypeptidase N-terminal domain\nMuramoyl-tetrapeptide carboxypeptidase hydrolyses a peptide bond between a di-basic amino acid and the C-terminal D-alanine in the tetrapeptide moiety in peptidoglycan. This cleaves the bond between an L- and a D-amino acid. The function of this activity is in murein recycling. This family also includes the microcin c7 self-immunity protein Swiss:Q47511. This family corresponds to Merops family S66.", "669": "PF00221\nAromatic amino acid lyase\nThis family includes proteins with phenylalanine ammonia-lyase, EC:4.3.1.24, histidine ammonia-lyase, EC:4.3.1.3, and tyrosine aminomutase, EC:5.4.3.6, activities [1-3].", "670": "PF14748\nPyrroline-5-carboxylate reductase dimerisation\nPyrroline-5-carboxylate reductase consists of two domains, an N-terminal catalytic domain (Pfam:PF03807) and a C-terminal dimerisation domain. 
This is the dimerisation domain [PMID:16233902].", "671": "Uncharacterized ACR, COG1678", "672": "PF04472\nCell division protein SepF\nSepF accumulates at the cell division site in an FtsZ-dependent manner and is required for proper septum formation [PMID:16420366]. Mutants are viable but the formation of the septum is much slower and occurs with a very abnormal morphology. This family also includes archaeal related proteins of unknown function.", "673": "PF14310\nFibronectin type III-like domain\nThis domain has a fibronectin type III-like structure [PMID:20138890]. It is often found in association with Pfam:PF00933 and Pfam:PF01915. Its function is unknown.", "674": "PF18072\nFormylglycinamide ribonucleotide amidotransferase linker domain\nThis is the linker domain found in Formylglycinamide ribonucleotide amidotransferase (FGAR-AT), also known as Phosphoribosylformylglycinamidine synthase (EC:6.3.5.3), PurL and formylglycinamidine ribonucleotide (FGAM) synthase. This enzyme catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), ADP, Pi, and glutamate in the fourth step of the purine biosynthetic pathway. The structure analysis of Salmonella typhimurium FGAR-AT reveals that this linker domain is made up of a long hydrophilic belt with an extended conformation [PMID:15301531].", "675": "PF01458\nSUF system FeS cluster assembly, SufBD\nIron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S] [PMID:16221578]. 
FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems. The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA [PMID:17350000]. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. SufA is homologous to IscA [PMID:15278785], acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoproteins targets. This entry represents SufB and SufD proteins, which are homologous, and form part of the SufBCD complex in the SUF system [PMID:26472926]. SufB accepts sulfur transferred from SufE [PMID:17350958], whereas SufD may play a role in iron acquisition [PMID:20857974].", "676": "PF03636\nGlycosyl hydrolase family 65, N-terminal domain\nThis family of glycosyl hydrolases contains vacuolar acid trehalase and maltose phosphorylase.Maltose phosphorylase (MP) is a dimeric enzyme that catalyses the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. 
This domain is believed to be essential for catalytic activity [PMID:11587643] although its precise function remains unknown.", "677": "Inosine-uridine preferring nucleoside hydrolase", "678": "PF00684\nDnaJ central domain\nThe central cysteine-rich (CR) domain of DnaJ proteins contains four repeats of the motif CXXCXGXG where X is any amino acid. The isolated cysteine rich domain folds in zinc dependent fashion. Each set of two repeats binds one unit of zinc. Although this domain has been implicated in substrate binding, no evidence of specific interaction between the isolated DNAJ cysteine rich domain and various hydrophobic peptides has been found [PMID:10891270].", "679": "Tetratricopeptide repeat", "680": "PF04172\nLrgB-like family\nThe two products of the lrgAB operon are potential membrane proteins, and LrgA and LrgB are both thought to control of murein hydrolase activity and penicillin tolerance [PMID:10714982].", "681": "PF07873\nYabP family\nThis family of proteins is involved in spore coat assembly during the process of sporulation [PMID:15231775].", "682": "PF10410\nDnaB-helicase binding domain of primase\nThis domain is the C-terminal region three-helical domain of primase [PMID:10741967]. Primases synthesise short RNA strands on single-stranded DNA templates, thereby generating the hybrid duplexes required for the initiation of synthesis by DNA polymerases. Primases are recruited to single-stranded DNA by helicases, and this domain is the region of the primase which binds DnaB-helicase [PMID:10873470]. It is associated with the Toprim domain (Pfam:PF01751) which is the central catalytic core.", "683": "PF08439\nOligopeptidase F\nThis domain is found to the N-terminus of the Pfam:PF01432 domain in bacterial and archaeal proteins including Oligoendopeptidase F. An example of this protein is Lactococcus lactis PepF [PMID:7798200].", "684": "PF02245\nMethylpurine-DNA glycosylase (MPG)\nMethylpurine-DNA glycosylase is a base excision-repair protein. 
It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA.", "685": "PF16870\n2-oxoglutarate dehydrogenase C-terminal\nOxoGdeHyase_C is a family found immediately C-terminal to Transket_pyr, Pfam:PF02779. It is found at the C-terminus of 2-oxoglutarate dehydrogenase.", "686": "PF02650\nWhiA C-terminal HTH domain\nThis domain is found at the C-terminus of the sporulation regulator WhiA. It is predicted to form a DNA-binding helix-turn-helix structure [PMID:17603302]. The WhiA protein also contains two N-terminal domains that are distant homologues of LAGLIDADG homing endonucleases [PMID:17603302].", "687": "PF01967\nMoaC family\nMembers of this family are involved in molybdenum cofactor biosynthesis. However their molecular function is not known.", "688": "PF01887\nSAM hydroxide adenosyltransferase N-terminal domain\nThis is a family of proteins, previously known as DUF62, found in archaebacteria and bacteria. The structure of proteins in this family is similar to that of a bacterial fluorinating enzyme [PMID:17910070]. S-adenosyl-l-methionine hydroxide adenosyltransferases utilises a rigorously conserved amino acid side chain triad (Asp-Arg-His) which may have a role in activating water to hydroxide ion [PMID:18675376]. This family used to be known as DUF62.", "689": "HSF-type DNA-binding", "690": "PF02873\nUDP-N-acetylenolpyruvoylglucosamine reductase, C-terminal domain\nMembers of this family are UDP-N-acetylenolpyruvoylglucosamine reductase enzymes EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan.", "691": "PF01523\nPmbA/TldA metallopeptidase domain 1\nThis entry represents a group of metalloproteases. 
The tertiary structure of the Escherichia coli TldD/TldE complex has been solved, and shows that the TldD subunit is the active peptidase, binding a single zinc ion at an HEXXXH motif in which the glutamic acid is a substrate-binding residue and the two histidines are zinc ligands. The third zinc ligand is a cysteine, C-terminal to the HEXXXH motif. The TldE (also known as PmbA) by itself has no catalytic activity, does not bind zinc, and does not carry the HEXXXH motif [PMID:28943336]. TldD and TldE were originally identified as regulators of DNA gyrase. Later, they were shown to be metalloproteases involved in CcdA degradation [2-3].", "692": "PF13449\nEsterase-like activity of phytase\nThis is a repeated domain that carries several highly conserved Glu and Asp residues indicating the likelihood that the domain incorporates the enzymic activity of the PLC-like phospho-diesterase part of the proteins. ", "693": "Ribosomal protein L7/L12 C-terminal domain", "694": "PF16192\nC-terminal four TMM region of protein-O-mannosyltransferase\nPMT_4TMC is the C-terminal four membrane-pass region of protein-O-mannosyltransferases and similar enzymes.", "695": "PF01981\nPeptidyl-tRNA hydrolase PTH2\nPeptidyl-tRNA hydrolases are enzymes that release tRNAs from peptidyl-tRNA during translation.", "696": "PF13649\nMethyltransferase domain\nThis family appears to be a methyltransferase domain.", "697": "PF00926\n3,4-dihydroxy-2-butanone 4-phosphate synthase\n3,4-Dihydroxy-2-butanone 4-phosphate is biosynthesised from ribulose 5-phosphate and serves as the biosynthetic precursor for the xylene ring of riboflavin. Sometimes found as a bifunctional enzyme with Pfam:PF00925.", "698": "PF18741\nREase_MTES_1575\nVsr REase Fold. 
Fused to HEPN (SWT1/Abi2 family), along with Transglutaminase and wHTH [PMID:23768067].", "699": "PF14841\nFliG middle domain\nThis is the middle domain of the flagellar rotor protein FliG [1-2].", "700": "PF02201\nSWIB/MDM2 domain\nThis family includes the SWIB domain and the MDM2 domain [PMID:12016060]. The p53-associated protein (MDM2) is an inhibitor of the p53 tumour suppressor gene binding the transactivation domain and down regulating the ability of p53 to activate transcription. This family contains the p53 binding domain of MDM2 [PMID:8875929].", "701": "PF03928\nHaem degrading protein HbpS-like\nThis entry includes haem degrading protein HbpS from Streptomyces reticuli (swiss:Q9RIM2) and GlcG from Escherichia coli [PMID:8606183]. HbpS is up-regulated in response to haemin- and peroxide-based oxidative stress. It interacts with the SenS/SenR two-component signal transduction system. Iron binds to surface-exposed lysine residues of an octameric assembly of the protein [PMID:19244623]. The structure of GlcG is composed of an alpha-beta(2)-alpha(3)-beta(2)-alpha fold, similar to the Roadblock/LC7 domain.", "702": "PF06133\nControl of competence regulator ComK, YlbF/YmcA\nYlbF is a family of short Gram-positive and archaeal proteins that includes both YlbF and YmcA which may interact synergistically. The family is necessary for correct biofilm formation, as null mutants of ymcA and ylbF fail to form pellicles at air-liquid interfaces and grow on solid media as smooth, undifferentiated colonies. During development, YmcA, YlbF and YaaT, family PSPI, Pfam:PF04468, interact directly with one another forming a stable ternary complex, in vitro. All three proteins are required for competence, sporulation and the formation of biofilms. The YmcA-YlbF-YaaT complex affects the phosphotransfer between Spo0F and Spo0B, thus accelerating the production of Spo0A~P. 
The three processes of biofilm formation, mature spore formation and competence all require the active, phosphorylated form of Spo0A, as Spo0A-P [PMID:23490197, PMID:19202088].", "703": "Helix-turn-helix domain", "704": "PF01960\nArgJ family\nMembers of the ArgJ family catalyse the first EC:2.3.1.1 and fifth steps EC:2.3.1.35 in arginine biosynthesis. ", "705": "PF09479\nListeria-Bacteroides repeat domain (List_Bact_rpt)\nThis model describes a conserved core region of about 43 residues [PMID:21345802] which occurs in more than 400 mostly secreted or cell surface-located proteins from over 300 eubacterial and few archaeal species. These proteins contain between one and 34 copies of the domain that often occur in tandem arrays with up to more than 20 copies [PMID:27789707].", "706": "PF04403\nParaquat-inducible protein A\nParaquat is a superoxide radical-generating agent. The promoter for the pqiA gene is also inducible by other known superoxide generators [PMID:7751275]. This is predicted to be a family of integral membrane proteins, possibly located in the inner membrane. This family is related to NADH dehydrogenase subunit 2 (Pfam:PF00361).", "707": "PF04085\nrod shape-determining protein MreC\nMreC (murein formation C) is involved in the rod shape determination in E. coli, and more generally in cell shape determination of bacteria whether or not they are rod-shaped.", "708": "PF03483\nB3/4 domain\nThis domain is found in tRNA synthetase beta subunits as well as in some non tRNA synthetase proteins.", "709": "PF14698\nArgininosuccinate lyase C-terminal\nThis domain is found at the C-terminus of argininosuccinate lyase [1-2].", "710": "PF01179\nCopper amine oxidase, enzyme domain\nCopper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyse the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. 
The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme. ", "711": "PF17763\nGlutaminase/Asparaginase C-terminal domain\nThis domain is found at the C-terminus of asparaginase enzymes.", "712": "PF13167\nGTP-binding GTPase N-terminal\nThis is the N-terminal region of GTP-binding HflX-like proteins. The full-length members bind and interact with the 50S ribosome and are GTPases, hydrolysing GTP/GDP/ATP/ADP. This N-terminal region is necessary for stability of the whole protein.", "713": "Ribosomal L29 protein", "714": "Ribosomal protein S19", "715": "PF01871\nAMMECR1\nThis family consists of several AMMECR1 as well as several uncharacterised proteins. The contiguous gene deletion syndrome AMME is characterised by Alport syndrome, midface hypoplasia, mental retardation and elliptocytosis and is caused by a deletion in Xq22.3, comprising several genes including COL4A5, FACL4 and AMMECR1 [PMID:10828604]. This family contains sequences from several eukaryotic species as well as archaebacteria and it has been suggested that the AMMECR1 protein may have a basic cellular function, potentially in either the transcription, replication, repair or translation machinery [PMID:10049589].", "716": "PF17801\nAlpha galactosidase C-terminal beta sandwich domain\nThis domain is found at the C-terminus of alpha galactosidase enzymes.", "717": "7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase (HPPK)", "718": "PF06750\nBacterial Peptidase A24 N-terminal domain\nThis family is found at the N-terminus of the pre-pilin peptidases (Pfam:PF01478). 
Its function has not been specifically determined; however some of the family have been characterised as bifunctional ([PMID:8057924]), and this domain may contain the N-methylation activity (EC:2.1.1.-). It consists of an intracellular region between a pair of transmembrane regions. This region contains an invariant proline and two almost fully conserved disulphide bridges - hence the name DiS-P-DiS. The cysteines have been shown to be essential to the overall function of the enzyme in [PMID:8340405], but their role was incorrectly ascribed. ", "719": "PF02514\nCobN/Magnesium Chelatase\nThis family contains a domain common to the cobN protein and to magnesium protoporphyrin chelatase. CobN is implicated in the conversion of hydrogenobyrinic acid a,c-diamide to cobyrinic acid [PMID:1655697]. Magnesium protoporphyrin chelatase is involved in chlorophyll biosynthesis [PMID:8404842].", "720": "Creatinase/Prolidase N-terminal domain", "721": "PF00023\nAnkyrin repeat\nAnkyrins are multifunctional adaptors that link specific proteins to the membrane-associated, spectrin-actin cytoskeleton. This repeat-domain is a 'membrane-binding' domain of up to 24 repeated units, and it mediates most of the protein's binding activities. Repeats 13-24 are especially active, with known sites of interaction for the Na/K ATPase, Cl/HCO(3) anion exchanger, voltage-gated sodium channel, clathrin heavy chain and L1 family cell adhesion molecules. The ANK repeats are found to form a contiguous spiral stack such that ion transporters like the anion exchanger associate in a large central cavity formed by the ANK repeat spiral, while clathrin and cell adhesion molecules associate with specific regions outside this cavity [PMID:12456646][PMID:14731966].", "722": "PF04325\nProtein of unknown function (DUF465)\nFamily members are found in small bacterial proteins, and also in the heavy chains of eukaryotic myosin and kinesin, C terminal of the motor domain (Myosin Pfam:PF00063, Kinesin Pfam:PF00225). 
Members of this family may form coiled coil structures.", "723": "Endomembrane protein 70", "724": "UvrA DNA-binding domain", "725": "PF04551\nGcpE protein\nIn a variety of organisms, including plants and several eubacteria, isoprenoids are synthesised by the mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. Although different enzymes of this pathway have been described, the terminal biosynthetic steps of the MEP pathway have not been fully elucidated. GcpE gene of Escherichia coli is involved in this pathway [PMID:11274098].", "726": "Ubiquitin carboxyl-terminal hydrolase, family 1", "727": "PF02833\nDHHA2 domain\nThis domain is often found adjacent to the DHH domain Pfam:PF01368 and is called DHHA2 for DHH associated domain. This domain is diagnostic of DHH subfamily 2 members [PMID:9478130]. The domain is about 120 residues long and contains a conserved DXK motif at its amino terminus.", "728": "PF06949\nProtein of unknown function (DUF1292)\nThis family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown.", "729": "SURF1 family", "730": "PF13335\nMagnesium chelatase, subunit ChlI C-terminal\nThis is a family of the C-terminal of putative bacterial magnesium chelatase subunit ChlI proteins. Most members have the associated Pfam:PF01078.", "731": "PF07556\nProtein of unknown function (DUF1538)\nThis family contains several conserved glycines and phenylalanines.", "732": "PF09827\nCRISPR associated protein Cas2\nThis entry represents members of the family of Cas2, one of the first four protein families found to be associated with prokaryotic genomes containing multiple CRISPR elements. CRISPR systems protect against invasive nucleic acid sequences, including phage. Cas2 proteins have been characterised as either endoribonuclease (for ssRNA) or endodeoxyribonuclease (for dsDNA), depending on the system to which the Cas2 belongs [PMID:17379808, PMID:18482976]. 
The cas genes usually are found near the palindromic repeats. The structural subunit of Cas2, belongs to the VapD family of interferases. The interferase catalytic site is intact in the majority of the Cas2 proteins but is disrupted in some, and is not required for spacer acquisition [PMID:30905284, PMID:24793649]. This entry also includes the endoribonuclease VapD [PMID:22241770].", "733": "PF04376\nArginine-tRNA-protein transferase, N terminus\nThis family represents the N terminal region of the enzyme arginine-tRNA-protein transferase (EC 2.3.2.8), which catalyses the post-translational conjugation of arginine to the N terminus of a protein. In eukaryotes, this functions as part of the N-end rule pathway of protein degradation by conjugating a de-stabilising amino acid to the amino terminal aspartate or glutamate of a protein, targeting the protein for ubiquitin-dependent proteolysis. N terminal cysteine is sometimes modified [PMID:9858543]. In S cerevisiae, Cys20, 23, 94 and/or 95 are thought to be important for activity [PMID:7495814]. Of these, only Cys 94 appears to be completely conserved in this family.", "734": "PF05195\nAminopeptidase P, N-terminal domain\nThis domain is structurally very similar [PMID:9520390] to the creatinase N-terminal domain (Pfam:PF01321). However, little or no sequence similarity exists between the two families.", "735": "PF00478\nIMP dehydrogenase / GMP reductase domain\nThis family is involved in biosynthesis of guanosine nucleotide. Members of this family contain a TIM barrel structure. In the inosine monophosphate dehydrogenases 2 CBS domains Pfam:PF00571 are inserted in the TIM barrel [PMID:10200156]. 
This family is a member of the common phosphate binding site TIM barrel family.", "736": "phosphoenolpyruvate-dependent sugar phosphotransferase system, EIIA 1", "737": "PF04145\nCtr copper transporter family\nThe redox active metal copper is an essential cofactor in critical biological processes such as respiration, iron transport, oxidative stress protection, hormone production, and pigmentation. A widely conserved family of high-affinity copper transport proteins (Ctr proteins) mediates copper uptake at the plasma membrane. A series of clustered methionine residues in the hydrophilic extracellular domain, and an MXXXM motif in the second transmembrane domain, are important for copper uptake. These methionines probably coordinate copper during the process of metal transport.", "738": "Ribosomal protein L35", "739": "PF13378\nEnolase C-terminal domain-like\nThis domain appears at the C-terminus of many of the proteins that carry the MR_MLE_N Pfam:PF02746 domain. EC:4.2.1.40.", "740": "PF07516\nSecA Wing and Scaffold domain\nSecA protein binds to the plasma membrane where it interacts with proOmpA to support translocation of proOmpA through the membrane. SecA protein achieves this translocation, in association with SecY protein, in an ATP dependent manner. This family is composed of two C-terminal alpha helical subdomains: the wing and scaffold subdomains [PMID:12242434].", "741": "PF13660\nDomain of unknown function (DUF4147)\nThis domain is frequently found at the N-terminus of proteins carrying the glycerate kinase-like domain MOFRL, Pfam:PF05161.", "742": "PF01514\nSecretory protein of YscJ/FliF family\nThis family includes proteins that are related to the YscJ lipoprotein, and the amino terminus of FliF, the flagellar M-ring protein. The members of the YscJ family are thought to be involved in secretion of several proteins. 
The FliF protein ring is thought to be part of the export apparatus for flagellar proteins, based on the similarity to YscJ proteins [PMID:10049798].", "743": "PF17957\nBacterial Ig domain\nThis entry represents a bacterial Ig-like domain that is found in glycosyl hydrolase enzymes.", "744": "PF01628\nHrcA protein C terminal domain\nHrcA is found to negatively regulate the transcription of heat shock genes [PMID:8576042, PMID:8606155]. HrcA contains an amino terminal helix-turn-helix domain, however this corresponds to the carboxy terminal domain.", "745": "PF14008\nIron/zinc purple acid phosphatase-like protein C\nThis domain is found at the C-terminus of Purple acid phosphatase proteins.", "746": "Ribosomal protein S16", "747": "PF03186\nCobD/Cbib protein\nThis family includes CobD proteins from a number of bacteria, in Salmonella this protein is called Cbib. Salmonella CobD is a different protein [PMID:7635831]. This protein is involved in cobalamin biosynthesis and is probably an enzyme responsible for the conversion of adenosylcobyric acid to adenosylcobinamide or adenosylcobinamide phosphate [PMID:7635831].", "748": "PF02742\nIron dependent repressor, metal binding and dimerisation domain\nThis family includes the Diphtheria toxin repressor [PMID:7568230]. It acts as an iron-binding repressor of diphtheria toxin gene expression and may serve as a global regulator of gene expression. DTXR comprises an N-terminal DNA-binding domain, an interface domain (which contains two metal-binding sites) and a third, very flexible C-terminal domain. The second domain is responsible for dimerization and metal binding. Binding of DTXR to Tox operator requires a divalent metal ion such as cobalt, ferric, manganese and nickel whereas zinc shows weak activation [PMID:7743135]. This domain can also bind Cd(II), Ca(II) and Cu(II) (Matilla et al., FEMS Microbiology Reviews, fuab043, 45, 2021, 1. 
https://doi.org/10.1093/femsre/fuab043).", "749": "PF04304\nProtein of unknown function (DUF454)\nPredicted membrane protein.", "750": "PF02561\nFlagellar protein FliS\nFliS is coded for by the FliD operon and is transcribed in conjunction with FliD and FliT, however this protein has no known function.", "751": "Ribosomal L27 protein", "752": "PF16875\nGlycosyl hydrolase family 36 N-terminal domain\nThis domain is found at the N-terminus of many family 36 glycoside hydrolases. It has a beta-supersandwich fold [PMID:23012371].", "753": "PF04020\nMycobacterial 4 TMS phage holin, superfamily IV\nThese proteins are predicted transmembrane proteins with probably four transmembrane spans. The 1.E.40 is represented by the mycobacterial 4 phage holin, but it also contains many cyanobacterial, proteobacterial and firmicute proteins. Holins are encoded within the genomes of Gram-positive and Gram-negative bacteria as well as in those of the bacteriophage of these organisms. The primary function of holins appears to be transport of murein hydrolases across the cytoplasmic membrane to the cell wall where these enzymes hydrolyse the cell wall polymer as a prelude to cell lysis. When chromosomally encoded the enzymes are therefore autolysins. Holins may also facilitate leakage of electrolytes and nutrients from the cell cytoplasm, thereby promoting cell death. Some may catalyse export of nucleases.", "754": "Nebulin repeat", "755": "PF01312\nFlhB HrpN YscU SpaS Family\nThis family includes the following members: FlhB, HrpN, YscU, SpaS, HrcU, SsaU and YopU. All of these proteins export peptides using the type III secretion system. The peptides exported are quite diverse.", "756": "PF01923\nCobalamin adenosyltransferase\nThis family contains the gene products of PduO and EutT which are both cobalamin adenosyltransferases. PduO is a protein with ATP:cob(I)alamin adenosyltransferase activity. 
The main role of this protein is the conversion of inactive cobalamins to AdoCbl for 1,2-propanediol degradation [PMID:11160088]. The EutT enzyme appears to be an adenosyl transferase, converting CNB12 to AdoB12 [PMID:10464203].", "757": "PF08331\nEpoxyqueuosine reductase QueG, DUF1730\nThis domain of unknown function occurs in Epoxyqueuosine reductase QueG, an iron-sulfur cluster-binding protein, together with the 4Fe-4S binding domain (Pfam:PF00037). QueG catalyses the conversion of epoxyqueuosine (oQ) to queuosine (Q), which is a hypermodified base found in the wobble positions of tRNA(Asp), tRNA(Asn), tRNA(His) and tRNA(Tyr) [PMID:21502530].", "758": "PF06628\nCatalase-related immune-responsive\nThis family represents a small conserved region within catalase enzymes (EC:1.11.1.6). All members also contain the Catalase family, Pfam:PF00199 domain. Catalase decomposes hydrogen peroxide into water and oxygen, serving to protect cells from its toxic effects [PMID:11351128]. This domain carries the immune-responsive amphipathic octa-peptide that is recognised by T cells [PMID:15585332].", "759": "PF01504\nPhosphatidylinositol-4-phosphate 5-Kinase\nThis family contains a region from the common kinase core found in the type I phosphatidylinositol-4-phosphate 5-kinase (PIP5K) family as described in [PMID:9535851]. The family consists of various type I, II and III PIP5K enzymes. PIP5K catalyses the formation of phosphoinositol-4,5-bisphosphate via the phosphorylation of phosphatidylinositol-4-phosphate, a precursor in the phosphoinositide signaling pathway.", "760": "60Kd inner membrane protein", "761": "PF03937\nFlavinator of succinate dehydrogenase\nThis family includes the highly conserved mitochondrial and bacterial proteins Sdh5/SDHAF2/SdhE. Both yeast and human Sdh5/SDHAF2 interact with the catalytic subunit of the succinate dehydrogenase (SDH) complex, a component of both the electron transport chain and the tricarboxylic acid cycle. 
Sdh5 is required for SDH-dependent respiration and for Sdh1 flavination (incorporation of the flavin adenine dinucleotide cofactor). Mutational inactivation of Sdh5 confers tumor susceptibility in humans [PMID:19628817]. Bacterial homologues of Sdh5, termed SdhE, are functionally conserved being required for the flavinylation of SdhA and succinate dehydrogenase activity. Like Sdh5, SdhE interacts with SdhA. Furthermore, SdhE was characterised as a FAD co-factor chaperone that directly binds FAD to facilitate the flavinylation of SdhA. Phylogenetic analysis demonstrates that SdhE/Sdh5 proteins evolved only once in an ancestral alpha-proteobacteria prior to the evolution of the mitochondria and now remain in subsequent descendants including eukaryotic mitochondria and the alpha, beta and gamma proteobacteria [PMID:22474332]. This family was previously annotated in Pfam as being a divergent TPR repeat but structural evidence has indicated this is not true. The E. coli protein, YgfY also acts as the antitoxin to the membrane-bound toxin family Cpta, Pfam:PF13166, whose E. coli member YgfX, expressed from the same operon as YgfY [PMID:22239607].", "762": "PF01825\nGPCR proteolysis site, GPS, motif\nThe GPS motif is found in GPCRs, and is the site for auto-proteolysis, so is thus named, GPS [PMID:9920906, PMID:9830014, PMID:10469603, PMID:17525154]. The GPS motif is a conserved sequence of ~40 amino acids containing canonical cysteine and tryptophan residues, and is the most highly conserved part of the domain. In most, if not all, cell-adhesion GPCRs these undergo autoproteolysis in the GPS between a conserved aliphatic residue (usually a leucine) and a threonine, serine, or cysteine residue [PMID:12270923]. In higher eukaryotes this motif is found embedded in the C-terminal beta-stranded part of a GAIN domain - GPCR-Autoproteolysis INducing (GAIN). 
The GAIN-GPS domain adopts a fold in which the GPS motif, at the C-terminus, forms five beta-strands that are tightly integrated into the overall GAIN domain. The GPS motif, evolutionarily conserved from tetrahymena to mammals, is the only extracellular domain shared by all human cell-adhesion GPCRs and PKD proteins, and is the locus of multiple human disease mutations. The GAIN-GPS domain is both necessary and sufficient functionally for autoproteolysis, suggesting an autoproteolytic mechanism whereby the overall GAIN domain fine-tunes the chemical environment in the GPS to catalyse peptide bond hydrolysis [PMID:22333914]. In the cell-adhesion GPCRs and PKD proteins, the GPS motif is always located at the end of their long N-terminal extracellular regions, immediately before the first transmembrane helix of the respective protein.", "763": "PF01346\nDomain amino terminal to FKBP-type peptidyl-prolyl isomerase\nThis family is only found at the amino terminus of Pfam:PF00254. This entry represents the N-terminal domain found in FKBP-type peptidylprolyl isomerases (PPIase). The N-terminal domain forms the dimer interface by the mutual exchange of two beta-strands between monomers [PMID:14672666].", "764": "PF12951\nPassenger-associated-transport-repeat\nThis Autotransporter-associated beta strand repeat model represents a core 32-residue region of a class of bacterial protein repeat found in one to 30 copies per protein. Most proteins with a copy of this repeat have domains associated with membrane autotransporters (Pfam:PF03797). The repeats occur with a periodicity of 60 to 100 residues. A pattern of sequence conservation is that every second residue is well-conserved across most of the domain. These repeats as likely to have a beta-helical structure. This repeat plays a role in the efficient transport of autotransporter virulence factors to the bacterial surface during growth and infection. 
The repeat is always associated with the passenger domain of the autotransporter. For these reasons it has been coined the Passenger-associated Transport Repeat (PATR) [PMID:25869731]. The mechanism by which the PATR motif promotes transport is uncertain but it is likely that the conserved glycines (see HMM Logo) are required for flexibility of folding and that this folding drives secretion [PMID:25869731]. Autotransporters that contain PATR(s) associate with distinct virulence traits such as subtilisin (S8) type protease domains and polymorphic outer-membrane protein repeats, whilst SPATE (S6) type protease and lipase-like autotransporters do not tend to contain PATR motifs [PMID:25869731].", "765": "PF11838\nERAP1-like C-terminal domain\nThis large domain is composed of 16 alpha helices organized as 8 HEAT-like repeats. This domain forms a concave face that faces towards the active site of the peptidase.", "766": "PF13676\nTIR domain\nThis is a family of Toll-like receptors.", "767": "Glucose-6-phosphate dehydrogenase, NAD binding domain", "768": "Ribosomal prokaryotic L21 protein", "769": "PF02633\nCreatinine amidohydrolase\nCreatinine amidohydrolase (EC:3.5.2.10), or creatininase, catalyses the hydrolysis of creatinine to creatine [PMID:7670196]. ", "770": "PF13975\ngag-polyprotein putative aspartyl protease\nThis family of putative aspartyl proteases is found pre-dominantly in retroviral proteins.", "771": "PF13640\n2OG-Fe(II) oxygenase superfamily\nThis family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily [PMID:11276424].", "772": "Forkhead domain", "773": "PF18076\nFormylglycinamide ribonucleotide amidotransferase N-terminal\nThis is the N-terminal domain found in Formylglycinamide ribonucleotide amidotransferase (FGAR-AT), also known as Phosphoribosylformylglycinamidine synthase (EC:6.3.5.3), PurL and formylglycinamidine ribonucleotide (FGAM) synthase. 
This enzyme catalyzes the ATP-dependent conversion of formylglycinamide ribonucleotide and glutamine to formylglycinamidine ribonucleotide, ADP, Pi, and glutamate in the fourth step of the purine biosynthetic pathway [PMID:24223728].", "774": "Preprotein translocase SecG subunit", "775": "PF00309\nSigma-54 factor, Activator interacting domain (AID)\nThe sigma-54 holoenzyme is an enhancer dependent form of the RNA polymerase. The AID is necessary for activator interaction [PMID:11544185]. In addition, the AID also inhibits transcription initiation in the sigma-54 holoenzyme prior to interaction with the activator [PMID:11544185].", "776": "PF06022\nPlasmodium variant antigen protein Cir/Yir/Bir\nThis family consists of several Cir, Yir and Bir proteins from the Plasmodium species P.chabaudi, P.yoelii and P.berghei.", "777": "PF02699\nPreprotein translocase subunit\nSee [PMID:11051763].", "778": "PF07494\nTwo component regulator propeller\nA large group of two component regulator proteins appear to have the same N-terminal structure of 14 tandem repeats. These repeats show homology to Pfam:PF01011 and Pfam:PF00400 indicating that they are likely to form a beta-propeller. This family has been built with artificially high cut-offs in order to avoid overlaps with other beta-propeller families. The fourteen repeats are likely to form two propellers; it is not clear if these structures are likely to recruit other proteins or interact with DNA.", "779": "PF04199\nPutative cyclase\nProteins in this family are thought to be cyclase enzymes. They are found in proteins involved in antibiotic synthesis. However they are also found in organisms that do not make antibiotics pointing to a wider role for these proteins. 
The proteins contain a conserved motif HXGTHXDXPXH that is likely to form part of the active site.", "780": "PF18758\nKyakuja-Dileera-Zisupton transposase\nA transposase family with an RNaseH catalytic domain, often fused to DNA binding domains such as SAP or cysteine cluster domains. KDZ transposases are widely present in fungi, metazoa, chlorophytes and haptophytes. Fungal versions are often associated with a TET/JBP family of dioxygenases [PMID:24398522, PMID:21873630].", "781": "PF03975\nCheD chemotactic sensory transduction\nThis chemotaxis protein stimulates methylation of MCP proteins [PMID:8866475]. The chemotaxis machinery of Bacillus subtilis is similar to that of the well characterised system of Escherichia coli. However, B. subtilis contains several chemotaxis genes not found in the E. coli genome, such as CheC and CheD, indicating that the B. subtilis chemotactic system is more complex. CheD plays an important role in chemotactic sensory transduction for many organisms. CheD deamidates other B. subtilis chemoreceptors including McpB and McpC. Deamidation by CheD is required for B. subtilis chemoreceptors to effectively transduce signals to the CheA kinase [PMID:12011078]. The structure of a complex between the signal-terminating phosphatase, CheC, and the receptor-modifying deamidase, CheD, reveals how CheC mimics receptor substrates to inhibit CheD and how CheD stimulates CheC phosphatase activity. CheD resembles other cysteine deamidases from bacterial pathogens that inactivate host Rho-GTPases. Phospho-CheY, the intracellular signal and CheC target, stabilises the CheC-CheD complex and reduces availability of CheD [PMID:16469702]. 
A model is proposed whereby CheC acts as a CheY-P-induced regulator of CheD; CheY-P would cause CheC to sequester CheD from the chemoreceptors, inducing adaptation of the chemotaxis system [PMID:17908686].", "782": "PF01416\ntRNA pseudouridine synthase\nInvolved in the formation of pseudouridine at the anticodon stem and loop of transfer-RNAs. Pseudouridine is an isomer of uridine (5-(beta-D-ribofuranosyl) uracil), and is the most abundant modified nucleoside found in all cellular RNAs. The TruA-like proteins also exhibit a conserved sequence with a strictly conserved aspartic acid, likely involved in catalysis.", "783": "PF05792\nCandida agglutinin-like (ALS)\nThis family consists of several agglutinin-like proteins from different Candida species. ALS genes of Candida albicans encode a family of cell-surface glycoproteins with a three-domain structure. Each Als protein has a relatively conserved N-terminal domain, a central domain consisting of a tandemly repeated motif of variable number, and a serine-threonine-rich C-terminal domain that is relatively variable across the family. The ALS family exhibits several types of variability that indicate the importance of considering strain and allelic differences when studying ALS genes and their encoded proteins [PMID:11124701]. Fungal adhesins, which include sexual agglutinins, virulence factors, and flocculins, are surface proteins that mediate cell-cell and cell-environment interactions. It is possible that both the serine/threonine-rich domain and the cysteine residues in the C-terminal and DIPSY Pfam:PF11763 participate in anchoring the terminal domains inside the wall, so that only the inner part of Map4p, including the repeat region, is sticking out as a fold-back loop then able to act in adhesion [PMID:17870620].", "784": "PF12392\nCollagenase\nThis domain family is found in bacteria, archaea and eukaryotes, and is approximately 120 amino acids in length. 
The family is found in association with Pfam:PF01136.", "785": "PF08529\nNusA N-terminal domain\nThis domain represents the RNA polymerase binding domain of NusA.", "786": "PF01899\nNa+/H+ ion antiporter subunit\nSubunit of a Na+/H+ Prokaryotic antiporter complex ([PMID:9852009],[PMID:9680201]).", "787": "PF01625\nPeptide methionine sulfoxide reductase\nThis enzyme repairs damaged proteins. Methionine sulfoxide in proteins is reduced to methionine.", "788": "PF02417\nChromate transporter\nMembers of this family probably act as chromate transporters [PMID:2152903, PMID:2180932]. Members of this family are found in both bacteria and archaebacteria. The proteins are composed of one or two copies of this region. The alignment contains two conserved motifs, FGG and PGP.", "789": "PF07525\nSOCS box\nThe SOCS box acts as a bridge between specific substrate- binding domains and more generic proteins that comprise a large family of E3 ubiquitin protein ligases.", "790": "PF05033\nPre-SET motif\nThis protein motif is a zinc binding motif [PMID:12389037]. It contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilising SET domains.", "791": "PF10397\nAdenylosuccinate lyase C-terminus\nThis is the C-terminal seven alpha helices of the structure whose full length represents the enzyme adenylosuccinate lyase. 
This sequence lies C-terminal to the conserved motif necessary for beta-elimination reactions [PMID:9274883], Adenylosuccinate lyase catalyses two steps in the synthesis of purine nucleotides: the conversion of succinylaminoimidazole-carboxamide ribotide into aminoimidazole-carboxamide ribotide, the eighth step of the de novo pathway, and the formation of adenosine monophosphate (AMP) from adenylosuccinate, the second step in the conversion of inosine monophosphate into AMP [PMID:17485188].", "792": "PF03707\nBacterial signalling protein N terminal repeat\nFound as an N terminal triplet tandem repeat in bacterial signalling proteins. Family includes CoxC (Swiss:Q9KX27) and CoxH (Swiss:Q9KX23) from P.carboxydovorans. Each repeat contains two transmembrane helices. Domain is also described as the MHYT domain [PMID:11728710].", "793": "TAT (twin-arginine translocation) pathway signal sequence", "794": "PF08459\nUvrC RNAse H endonuclease domain\nThis domain is found in the C subunits of the bacterial and archaeal UvrABC system which catalyses nucleotide excision repair in a multi-step process. UvrC catalyses the first incision on the fourth or fifth phosphodiester bond 3' and on the eighth phosphodiester bond 5' from the damage that is to be excised [PMID:17245438]. The domain described here represents the RNAse H endonuclease domain, located at the C-terminal, between the UvrBC and the (HhH)2 domains, nearby the N-terminal of the HhH. Despite the lack of sequence homology, the endonuclease domain has an RNase H-like fold, which is characteristic of enzymes with nuclease or polynucleotide transferase activities. RNase H-related enzymes typically contain a highly conserved carboxylate triad, usually DDE, in their catalytic centre. 
However, instead of a third carboxylate, UvrC of Thermotoga maritima was found to contain a highly conserved histidine (H488) on helix-4 in close proximity to two aspartates [PMID:17245438].", "795": "PF08592\nAnthrone oxygenase\nThis family consists of anthrone oxygenases found in bacteria and fungi, and involved in the synthesis of different products. GedH from Aspergillus terreus is part of the gene cluster that mediates the biosynthesis of geodin [PMID:24009710], EncC from Neosartorya fumigata is involved in the biosynthesis of endocrocin [PMID:22492455], and MdpH from Aspergillus nidulans in the biosynthesis of monodictyphenone [PMID:21351751].", "796": "Citrate transporter", "797": "Aminoacyl tRNA synthetase class II, N-terminal domain", "798": "PF00174\nOxidoreductase molybdopterin binding domain\nThis domain is found in a variety of oxidoreductases. This domain binds to a molybdopterin cofactor. Xanthine dehydrogenases, that also bind molybdopterin, have essentially no similarity.", "799": "PF03625\nDomain of unknown function DUF302\nDomain is found in an undescribed set of proteins [PMID:12625841]. Normally occurs uniquely within a sequence, but is found as a tandem repeat (Swiss:Q9X8B8). Shows interesting phylogenetic distribution with majority of examples in bacteria and archaea, but it is also found in some fungal proteins. The hypothetical protein TT1751 from Thermus thermophilus has a beta-alpha-beta(4)-alpha structural fold [PMID:15481054].", "800": "PF12631\nMnmE helical domain\nThe tRNA modification GTPase MnmE consists of three domains. An N-terminal domain, a helical domain and a GTPase domain which is nested within the helical domain. 
This family represents the helical domain [1-2].", "801": "PF10601\nLITAF-like zinc ribbon domain\nMembers of this family display a conserved zinc ribbon structure [PMID:12527760] with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumour necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS)[PMID:17408970]. The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure [PMID:11731489].", "802": "PF01641\nSelR domain\nMethionine sulfoxide reduction is an important process, by which cells regulate biological processes and cope with oxidative stress. MsrA, a protein involved in the reduction of methionine sulfoxides in proteins, has been known for four decades and has been extensively characterised with respect to structure and function. However, recent studies revealed that MsrA is only specific for methionine-S-sulfoxides. Because oxidised methionines occur in a mixture of R and S isomers in vivo, it was unclear how stereo-specific MsrA could be responsible for the reduction of all protein methionine sulfoxides. It appears that a second methionine sulfoxide reductase, SelR , evolved that is specific for methionine-R-sulfoxides, the activity that is different but complementary to that of MsrA. Thus, these proteins, working together, could reduce both stereoisomers of methionine sulfoxide. This domain is found both in SelR proteins and fused with the peptide methionine sulfoxide reductase enzymatic domain Pfam:PF01625. The domain has two conserved cysteine and histidines. The domain binds both selenium and zinc [PMID:11929995]. 
The final cysteine is found to be replaced by the rare amino acid selenocysteine in some members of the family [PMID:10608886]. This family has methionine-R-sulfoxide reductase activity [PMID:11929995].", "803": "PF04032\nRNAse P Rpr2/Rpp21/SNM1 subunit domain\nThis family contains a ribonuclease P subunit of humans and yeast. Other members of the family include the probable archaeal homologues. This family includes SNM1 [PMID:10523674]. It is a subunit of RNase MRP (mitochondrial RNA processing), a ribonucleoprotein endoribonuclease that has roles in both mitochondrial DNA replication and nuclear 5.8S rRNA processing. SNM1 is an RNA binding protein that binds the MRP RNA specifically [PMID:10523674]. This subunit possibly binds the precursor tRNA [PMID:11497433].", "804": "PF04166\nPyridoxal phosphate biosynthetic protein PdxA\nIn Escherichia coli the coenzyme pyridoxal 5'-phosphate is synthesised de novo by a pathway that is thought to involve the condensation of 4-(phosphohydroxy)-L-threonine and 1-deoxy-D-xylulose, catalysed by the enzymes PdxA and PdxJ, to form either pyridoxine (vitamin B6) or pyridoxine 5'-phosphate [PMID:10225425].", "805": "PF11915\nProtein of unknown function (DUF3433)\nThis is a family of functionally uncharacterised proteins. The family is found in eukaryotes, and represents the conserved central region of the member proteins.", "806": "PF00584\nSecE/Sec61-gamma subunits of protein translocation complex\nSecE is part of the SecYEG complex in bacteria which translocates proteins from the cytoplasm. In eukaryotes the complex, made from Sec61-gamma and Sec61-alpha translocates protein from the cytoplasm to the ER. Archaea have a similar complex.", "807": "PF18052\nRx N-terminal domain\nThis entry represents the N-terminal domain found in many plant resistance proteins [PMID:24194517]. 
This domain has been predicted to be a coiled-coil, however the structure shows that it adopts a four helical bundle fold [PMID:24194517].", "808": "Ribonucleotide reductase, small chain", "809": "Glycosyltransferase Family 4", "810": "PF02261\nAspartate decarboxylase\nDecarboxylation of aspartate is the major route of beta-alanine production in bacteria, and is catalysed by the enzyme aspartate decarboxylase EC:4.1.1.11 which requires a pyruvoyl group for its activity. It is synthesised initially as a proenzyme which is then proteolytically cleaved to an alpha (C-terminal) and beta (N-terminal) subunit and a pyruvoyl group. This family contains both chains of aspartate decarboxylase.", "811": "PF07743\nHSCB C-terminal oligomerisation domain\nThis domain is the HSCB C-terminal oligomerisation domain and is found on co-chaperone proteins.", "812": "PF08376\nNitrate and nitrite sensing\nThe nitrate- and nitrite sensing domain (NIT) is found in receptor components of signal transducing pathways in bacteria which control gene expression, cellular motility and enzyme activity in response to nitrate and nitrite concentrations. The NIT domain is predicted to be all alpha-helical in structure [PMID:12633990].", "813": "Glu/Leu/Phe/Val dehydrogenase, dimerisation domain", "814": "PF02929\nBeta galactosidase small chain\nThis domain comprises the small chain of dimeric beta-galactosidases EC:3.2.1.23. This domain is also found in single chain beta-galactosidase.", "815": "PF03990\nG5-linked-Ubiquitin-like domain\nThis domain normally occurs as tandem repeats; however it is found as a single copy in the S. cerevisiae DNA-binding nuclear protein YCR593 (Swiss:P25357). This protein is involved in sporulation part of the SET3C complex, which is required to repress early/middle sporulation genes during meiosis ([PMID:11711434]). 
The bacterial proteins are likely to be involved in a cell wall function as they are found in conjunction with the Pfam:PF07501 domain, which is involved in various cell surface processes. This domain is also present in the resuscitation-promoting factors RpfB from Mycobacterium tuberculosis and Rpf2 from Corynebacterium glutamicum. These are factors that stimulate resuscitation of dormant cells [PMID:19799629]. This domain has a beta grasp fold. Structural description of this domain revealed a structural conservation between these domains and ubiquitin, hence it is termed UBL-G5 [PMID:26549874].", "816": "PF00095\nWAP-type (Whey Acidic Protein) 'four-disulfide core'\nWAP belongs to the group of Elafin or elastase-specific inhibitors.", "817": "PF04613\nUDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase, LpxD\nUDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase (EC 2.3.1.-) catalyses an early step in lipid A biosynthesis: UDP-3-O-(3-hydroxytetradecanoyl)glucosamine + (R)-3-hydroxytetradecanoyl- [acyl carrier protein] -> UDP-2,3-bis(3-hydroxytetradecanoyl)glucosamine + [acyl carrier protein] [PMID:8366125]. Members of this family also contain a hexapeptide repeat (Pfam:PF00132). This family constitutes the non-repeating region of LPXD proteins.", "818": "PF01352\nKRAB box\nThe KRAB domain (or Kruppel-associated box) is present in about a third of zinc finger proteins containing C2H2 fingers. The KRAB domain is found to be involved in protein-protein interactions [PMID:8986806, PMID:8769649]. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B. The A box plays an important role in repression by binding to corepressors, while the B box is thought to enhance this repression brought about by the A box. 
KRAB-containing proteins are thought to have critical functions in cell proliferation and differentiation, apoptosis and neoplastic transformation [PMID:14519192].", "819": "PF03993\nDomain of Unknown Function (DUF349)\nThis domain is found singly or as up to five tandem repeats in a small set of bacterial proteins. There are two or three alpha-helices, and possibly a beta-strand.", "820": "PF04963\nSigma-54 factor, core binding domain\nThis domain makes a direct interaction with the core RNA polymerase, to form an enhancer dependent holoenzyme [PMID:10894718]. The centre of this domain contains a very weak similarity to a helix-turn-helix motif which may represent the other DNA binding domain.", "821": "PF02365\nNo apical meristem (NAM) protein\nThis is a family of no apical meristem (NAM) proteins these are plant development proteins. Mutations in NAM result in the failure to develop a shoot apical meristem in petunia embryos [PMID:8612269]. NAM is indicated as having a role in determining positions of meristems and primordial [PMID:8612269]. One member of this family NAP (NAC-like, activated by AP3/PI) is encoded by the target genes of the AP3/PI transcriptional activators and functions in the transition between growth by cell division and cell expansion in stamens and petals [PMID:9489703].", "822": "PF01774\nUreD urease accessory protein\nUreD is a urease accessory protein. Urease Pfam:PF00449 hydrolyses urea into ammonia and carbamic acid [PMID:8550495]. UreD is involved in activation of the urease enzyme via the UreD-UreF-UreG-urease complex [PMID:9209019] and is required for urease nickel metallocenter assembly [PMID:7909161]. See also UreF Pfam:PF01730, UreG Pfam:PF01495. 
", "823": "FAD binding domain of DNA photolyase", "824": "PF05658\nYadA head domain repeat (2 copies)\nThis entry represents two copies of a fourteen residue repeat that makes up the head domain of bacterial haemagglutinins and invasins.", "825": "Glycyl-tRNA synthetase beta subunit", "826": "PF05173\nDihydrodipicolinate reductase, C-terminus\nDihydrodipicolinate reductase (DapB) reduces the alpha,beta-unsaturated cyclic imine, dihydro-dipicolinate. This reaction is the second committed step in the biosynthesis of L-lysine and its precursor meso-diaminopimelate, which are critical for both protein and cell wall biosynthesis. The C-terminal domain of DapB has been proposed to be the substrate- binding domain.", "827": "PF03453\nMoeA N-terminal region (domain I and II)\nThis family contains two structural domains. One of these contains the conserved DGXA motif. This region is found in proteins involved in biosynthesis of molybdopterin cofactor however the exact molecular function of this region is uncertain.", "828": "PF13677\nMembrane MotB of proton-channel complex MotA/MotB\nThis is the MotB member of the E.coli MotA/MotB proton-channel complex that forms the stator of the bacterial membrane flagellar motor. Key residues act as a plug to prevent premature proton flow. The plug is in the periplasm just C-terminal to the MotB TM, consisting of an amphipathic alpha helix flanked by Pro-52 and Pro-65, eg in Swiss:D3V2T1. In addition to the Pro residues, Ile-58, Tyr-61, and Phe 62 are also essential for plug function [PMID:17052729][PMID:14705929].", "829": "NAD(P)H-binding", "830": "PF06418\nCTP synthase N-terminus\nThis family consists of the N-terminal region of the CTP synthase protein (EC:6.3.4.2). This family is found in conjunction with Pfam:PF00117 located in the C-terminal region of the protein. 
CTP synthase catalyses the synthesis of CTP from UTP by amination of the pyrimidine ring at the 4-position [PMID:12522217].", "831": "PF00349\nHexokinase\nHexokinase (EC:2.7.1.1) contains two structurally similar domains represented by this family and Pfam:PF03727. Some members of the family have two copies of each of these domains.", "832": "PF13894\nC2H2-type zinc finger\nThis family contains a number of divergent C2H2 type zinc fingers.", "833": "PF01266\nFAD dependent oxidoreductase\nThis family includes various FAD dependent oxidoreductases: Glycerol-3-phosphate dehydrogenase EC:1.1.99.5, Sarcosine oxidase beta subunit EC:1.5.3.1, D-alanine oxidase EC:1.4.99.1, D-aspartate oxidase EC:1.4.3.1.", "834": "Ribosomal protein S18", "835": "PF03775\nSeptum formation inhibitor MinC, C-terminal domain\nIn Escherichia coli, FtsZ Swiss:P06138 assembles into a Z ring at midcell while assembly at polar sites is prevented by the min system. MinC Swiss:P18196, a component of this system, is an inhibitor of FtsZ assembly that is positioned within the cell by interaction with MinDE. MinC is an oligomer, probably a dimer [PMID:10869074]. The C terminal half of MinC is the most conserved and interacts with MinD. The N terminal half is thought to interact with FtsZ.", "836": "PF13720\nUdp N-acetylglucosamine O-acyltransferase; Domain 2\nThis is domain 2, or the C-terminal domain, of Udp N-acetylglucosamine O-acyltransferase. This enzyme is a zinc-dependent enzyme that catalyses the deacetylation of UDP-3-O-((R)-3-hydroxymyristoyl)-N-acetylglucosamine to form UDP-3-O-(R-hydroxymyristoyl)glucosamine and acetate.", "837": "Ribosomal protein L33", "838": "Molybdopterin cofactor-binding domain", "839": "PF01808\nAICARFT/IMPCHase bienzyme\nThis is a family of bifunctional enzymes catalysing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. 
The second last step is catalysed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase EC:2.1.2.3 (AICARFT), this enzyme catalyses the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate [PMID:9332377]. This is catalysed by a pair of C-terminal deaminase fold domains in the protein [PMID:21890906], where the active site is formed by the dimeric interface of two monomeric units [PMID:21890906]. The last step is catalysed by the N-terminal IMP (Inosine monophosphate) cyclohydrolase domain EC:3.5.4.10 (IMPCHase), cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP [PMID:9332377]. ", "840": "Brix domain", "841": "PF14693\nRibosomal protein TL5, C-terminal domain\nThis family contains the C-terminal domain of ribosomal protein TL5. The N-terminal domain, which binds to 5S rRNA, is contained in family Ribosomal_L25p, Pfam:PF01386. Full length (N- and C-terminal domain) homologues of TL5 are also known as CTC proteins. TL5 or CTC are not found in Eukarya or Archaea. In some Bacteria, including E. coli, this ribosomal subunit occurs as a single domain protein (named Ribosomal subunit L25), where the only domain is homologous to TL5 N-terminal domain (hence included in family Pfam:PF01386). The function of the C-terminal domain of TLC is at present unknown.", "842": "PF00401\nATP synthase, Delta/Epsilon chain, long alpha-helix domain\nPart of the ATP synthase CF(1). These subunits are part of the head unit of the ATP synthase. This subunit is called epsilon in bacteria and delta in mitochondria. In bacteria the delta (D) subunit is equivalent to the mitochondrial Oligomycin sensitive subunit, OSCP (Pfam:PF00213).", "843": "PF18345\nZinc finger domain\nThis is a zinc finger domain found in Zinc finger CCCH-type with G patch domain-containing proteins such as ZIP. 
Functional studies indicate that ZIP specifically targets EGFR and represses its transcription, and that the zinc finger and the coiled-coil domains are central to that process [PMID:19644445].", "844": "PF00285\nCitrate synthase, C-terminal domain\nThis is the long, C-terminal part of the enzyme.", "845": "PF17136\nRibosomal proteins 50S L24/mitochondrial 39S L24\nThis is the family of bacterial 50S ribosomal subunit proteins L24. It also carries some mitochondrial 39S L24 proteins.", "846": "PF02700\nPhosphoribosylformylglycinamidine (FGAM) synthase\nThis family forms a component of the de novo purine biosynthesis pathway. ", "847": "PF06201\nPITH domain\nThis family was formerly known as DUF1000. The full-length, Txnl1, protein which is a probable component of the 26S proteasome, uses its C-terminal, PITH, domain to associate specifically with the 26S proteasome. PITH derives from proteasome-interacting thioredoxin domain.", "848": "PF04749\nPLAC8 family\nThis family includes Swiss:Q9NZF1, the Placenta-specific gene 8 protein.", "849": "Translation initiation factor IF-3, C-terminal domain", "850": "PF06429\nFlagellar basal body rod FlgEFG protein C-terminal\nThis family consists of a number of C-terminal domains of unknown function. This domain seems to be specific to flagellar basal-body rod and flagellar hook proteins in which Pfam:PF00460 is often present at the extreme N terminus.", "851": "NADH-ubiquinone/plastoquinone oxidoreductase, chain 3", "852": "PF17963\nBacterial Ig domain\nThis entry represents a wide variety of bacterial Ig domains.", "853": "PF04468\nPSP1 C-terminal conserved region\nThis region is present in both eukaryotes and eubacteria. 
The yeast PSP1 protein is involved in suppressing mutations in the DNA polymerase alpha subunit in yeast [PMID:9529527].", "854": "PF10576\nIron-sulfur binding domain of endonuclease III\nEscherichia coli endonuclease III (EC 4.2.99.18) [PMID:7664751] is a DNA repair enzyme that acts both as a DNA N-glycosylase, removing oxidised pyrimidines from DNA, and as an apurinic/apyrimidinic (AP) endonuclease, introducing a single-strand nick at the site from which the damaged base was removed. Endonuclease III is an iron-sulfur protein that binds a single 4Fe-4S cluster. The 4Fe-4S cluster does not seem to be important for catalytic activity, but is probably involved in the proper positioning of the enzyme along the DNA strand [PMID:9045706]. The 4Fe-4S cluster is bound by four cysteines which are all located in a 17 amino acid region at the C-terminal end of endonuclease III. A similar region is also present in the central section of mutY and in the C-terminus of ORF-10 and of the Micro-coccus UV endonuclease [PMID:16967954].", "855": "PF01925\nSulfite exporter TauE/SafE\nThis is a family of integral membrane proteins where the alignment appears to contain two duplicated modules of three transmembrane helices. The proteins are involved in the transport of anions across the cytoplasmic membrane [PMID:17768248] during taurine metabolism as an exporter of sulfoacetate [PMID:18506422]. This family used to be known as DUF81.", "856": "PF03478\nProtein of unknown function (DUF295)\nThis domain is found in plant proteins of unknown function. It can be found in association with F-box domain Pfam:PF00646.", "857": "PF01894\nUncharacterised protein family UPF0047\nThis family has no known function. 
The alignment contains a conserved aspartate and histidine that may be functionally important.", "858": "PF16321\nSigma 54 modulation/S30EA ribosomal protein C terminus\nThis domain often occurs at the C-terminus of proteins containing Pfam:PF02482.", "859": "PF01388\nARID/BRIGHT DNA binding domain\nThis domain is known as ARID for AT-Rich Interaction Domain [PMID:8543152], and also known as the BRIGHT domain [PMID:8622680].", "860": "PF07075\nProtein of unknown function (DUF1343)\nThis family consists of several hypothetical bacterial proteins of around 400 residues in length. The function of this family is unknown.", "861": "Bacterial regulatory helix-turn-helix protein, lysR family", "862": "PF00908\ndTDP-4-dehydrorhamnose 3,5-epimerase\nThis family catalyses the isomerisation of dTDP-4-dehydro-6-deoxy-D-glucose with dTDP-4-dehydro-6-deoxy-L-mannose. The EC number of this enzyme is 5.1.3.13.", "863": "PF01805\nSurp module\nThis domain is also known as the SWAP domain. SWAP stands for Suppressor-of-White-APricot. It has been suggested that these domains may be RNA binding [PMID:8206918].", "864": "Signal peptidase (SPase) II", "865": "PF06415\nBPG-independent PGAM N-terminus (iPGM_N)\nThis family represents the N-terminal region of the 2,3-bisphosphoglycerate-independent phosphoglycerate mutase (or phosphoglyceromutase or BPG-independent PGAM) protein (EC:5.4.2.1). The family is found in conjunction with Pfam:PF01676 (located in the C-terminal region of the protein).", "866": "PF02401\nLytB protein\nThe mevalonate-independent 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway for isoprenoid biosynthesis is essential in many eubacteria, plants, and the malaria parasite. The LytB gene is involved in the trunk line of the MEP pathway.", "867": "PF05681\nFumarate hydratase (Fumerase)\nThis family consists of several bacterial fumarate hydratase proteins FumA and FumB. Fumarase, or fumarate hydratase (EC 4.2.1.2), is a component of the citric acid cycle. 
In facultative anaerobes such as Escherichia coli, fumarase also engages in the reductive pathway from oxaloacetate to succinate during anaerobic growth. Three fumarases, FumA, FumB, and FumC, have been reported in E. coli. fumA and fumB genes are homologous and encode products of identical sizes which form thermolabile dimers of Mr 120,000. FumA and FumB are class I enzymes and are members of the iron-dependent hydrolases, which include aconitase and malate hydratase. The active FumA contains a 4Fe-4S centre, and it can be inactivated upon oxidation to give a 3Fe-4S centre [PMID:11133938].", "868": "PF03880\nDbpA RNA binding domain\nThis RNA binding domain is found at the C-terminus of a number of DEAD helicase proteins [PMID:10481020]. It is sufficient to confer specificity for hairpin 92 of 23S rRNA, which is part of the ribosomal A-site. However, several members of this family lack specificity for 23S rRNA. These proteins can generally be distinguished by a basic region that extends beyond this domain [Karl Kossen, unpublished data].", "869": "NADH-ubiquinone oxidoreductase-G iron-sulfur binding region", "870": "PF02233\nNAD(P) transhydrogenase beta subunit\nThis family corresponds to the beta subunit of NADP transhydrogenase in prokaryotes, and either the protein N- or C terminal in eukaryotes. The domain is often found in conjunction with Pfam:PF01262. Pyridine nucleotide transhydrogenase catalyses the reduction of NAD+ to NADPH. A complete loss of activity occurs upon mutation of Gly314 in E. 
coli [PMID:1633824].", "871": "PF03255\nAcetyl co-enzyme A carboxylase carboxyltransferase alpha subunit\nAcetyl co-enzyme A carboxylase carboxyltransferase is composed of an alpha and beta subunit.", "872": "PF13976\nGAG-pre-integrase domain\nThis domain is found associated with retroviral insertion elements and lies just upstream of the integrase region on the polyproteins.", "873": "PF04333\nMlaA lipoprotein\nMlaA is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane [PMID:19383799]. MlaA is required for the intercellular spreading of Shigella flexneri. It is attached to the outer membrane by a lipid anchor [PMID:8145644].", "874": "PF02574\nHomocysteine S-methyltransferase\nThis is a family of related homocysteine S-methyltransferase enzymes: 5-methyltetrahydrofolate--homocysteine S-methyltransferases, also known as EC:2.1.1.13, [PMID:9013615]; Betaine--homocysteine S-methyltransferase (vitamin B12 dependent), EC:2.1.1.5, [PMID:8798461]; and Homocysteine S-methyltransferase, EC:2.1.1.10, [PMID:9882684].", "875": "PF02772\nS-adenosylmethionine synthetase, central domain\nThe three domains of S-adenosylmethionine synthetase have the same alpha+beta fold.", "876": "PF07331\nTripartite tricarboxylate transporter TctB family\nThis family consists of several hypothetical bacterial proteins of around 150 residues in length. This family was formerly known as DUF1468.", "877": "PF00303\nThymidylate synthase\nThis is a family of proteins that are flavin-dependent thymidylate synthases.", "878": "Respiratory-chain NADH dehydrogenase, 49 Kd subunit", "879": "Ribosomal protein L20", "880": "PF16916\nDimerisation domain of Zinc Transporter\nZT_dimer is the dimerisation region of the whole molecule of zinc transporters since the full-length members form a homodimer during activity. 
The domain lies within the cytoplasm and exhibits an overall structural similarity with the copper metallochaperone Hah1 UniProtKB:O00244, exhibiting an open alpha-beta domain with two alpha helices (H1 and H2) aligned on one side and a three-stranded mixed beta-sheet (S1 to S3) on the other side. The N-terminal part of the members is the Cation_efflux family, Pfam:PF01545 [PMID:17717154].", "881": "PF01192\nRNA polymerase Rpb6\nRpb6 is an essential subunit in the eukaryotic polymerases Pol I, II and III. This family also contains the bacterial equivalent to Rpb6, the omega subunit. Rpb6 and omega are structurally conserved and both function in polymerase assembly [PMID:11158566].", "882": "PF05184\nSaposin-like type B, region 1\nSaposin B is a small non-enzymatic glycoprotein required for the breakdown of cerebroside sulphates (sulphatides) in lysosomes. Saposin B contains three intramolecular disulphide bridges, exists as a dimer and is remarkably heat, protease, and pH stable. The crystal structure of human saposin B reveals an unusual shell-like dimer consisting of a monolayer of alpha-helices enclosing a large hydrophobic cavity [PMID:7610480, PMID:12518053]. It is one of the most studied members of the saposin protein family and it is involved in the hydrolysis of glycolipids and glycerolipids. SapB is unique in the saposin family in that it facilitates degradation by interacting with the substrate, not the enzymes [PMID:26616259].", "883": "PF08378\nNuclease-related domain\nThe nuclease-related domain (NERD) is found in a range of bacterial as well as archaeal and plant proteins. It has distant similarity to endonucleases (hence its name) and its predicted secondary structure is helix - sheet - sheet - sheet - sheet - weak sheet/long loop - helix - sheet - sheet. 
The majority of NERD-containing proteins are single-domain, but in several cases proteins containing NERD have additional domains which in 75% of cases are involved in DNA processing [PMID:15055202].", "884": "PF12019\nType II transport protein GspH\nGspH is involved in bacterial type II export systems [PMID:18241884]. Like all pilins, GspH has an N terminus alpha helix [PMID:18241884]. This helix is followed by nine beta strands forming two beta sheets, one of five antiparallel strands and one of four antiparallel strands [PMID:18241884]. GspH is a minor pseudopilin; it is expressed much less than other pseudopilins in the type II secretion pilus (major pilins) [PMID:18241884]. The function and localisation of minor pseudo-pilins are still to be fully unraveled [PMID:18241884]. It has been suggested that some minor pseudopilins may assemble either into the base or the tip of pili, or both. They function as initiators or regulators of pilus biogenesis and dynamics, and/or as adaptors between various pseudopilin component and other members of the T2SS [PMID:18241884].", "885": "PF10143\n2,3-bisphosphoglycerate-independent phosphoglycerate mutase\nThis family represents 2,3-bisphosphoglycerate-independent phosphoglycerate mutase (iPGAM), a metalloenzyme found particularly in archaea and some eubacteria, which catalyses the interconversion of 2-phosphoglycerate and 3-phosphoglycerate in the reaction: [(2R)-2-phosphoglycerate = (2R)-3-phosphoglycerate] (EC 5.4.2.12) [PMID:17576516].", "886": "PF01988\nVIT family\nThis family includes the vacuolar Fe2+/Mn2+ uptake transporter Swiss:P47818, Ccc1 [PMID:7941738] and the vacuolar iron transporter VIT1 Swiss:Q9ZUA5.", "887": "PF03788\nLrgA family\nThis family is uncharacterised. 
It contains the protein LrgA that has been hypothesised to export murein hydrolases [PMID:8824633].", "888": "Transcriptional regulatory protein, C terminal", "889": "Transcription factor TFIID (or TATA-binding protein, TBP)", "890": "PF14714\nKH-domain-like of EngA bacterial GTPase enzymes, C-terminal\nThe KH-like domain at the C-terminus of the EngA subfamily of essential bacterial GTPases has a unique domain structure position. The two adjacent GTPase domains (GD1 and GD2), two domains of family MMR_HSR1, Pfam:PF01926, pack at either side of the C-terminal domain. This C-terminal domain resembles a KH domain but is missing the distinctive RNA recognition elements. Conserved motifs of the nucleotide binding site of GD1 are integral parts of the GD1-KH domain interface, suggesting the interactions between these two domains are directly influenced by the GTP/GDP cycling of the protein. In contrast, the GD2-KH domain interface is distal to the GDP binding site of GD2. This family has not been added to the KH clan since SCOP classifies it separately due to its missing the key KH motif/fold.", "891": "PF02805\nMetal binding domain of Ada\nThe Escherichia coli Ada protein repairs O6-methylguanine residues and methyl phosphotriesters in DNA by direct transfer of the methyl group to a cysteine residue. This domain contains four conserved cysteines that form a zinc binding site [PMID:1581309, PMID:8500619]. One of these cysteines is a methyl group acceptor. The methylated domain can then specifically bind to the ada box on a DNA duplex [PMID:8500619].", "892": "PF13382\nAdenine deaminase C-terminal domain\nThis family represents a C-terminal region of the adenine deaminase enzyme.", "893": "PF16486\nN-terminal domain of argonaute\nArgoN is the N-terminal domain of argonaute proteins in eukaryotes. 
ArgoN is composed of an antiparallel four-stranded beta sheet core that has two alpha helices positioned along one face of the sheet and an extended beta strand towards its N-terminus. The core fold of the N domain most closely resembles the catalytic domain of replication-initiator protein Rep. The N domain is linked to the PAZ domain via linker 1 region, and together these three regions are designated the PAZ-containing lobe of argonaute.", "894": "PF04060\nPutative Fe-S cluster\nThis family includes a domain with four conserved cysteines that probably form an Fe-S redox cluster.", "895": "PF07500\nTranscription factor S-II (TFIIS), central domain\nTranscription elongation by RNA polymerase II is regulated by the general elongation factor TFIIS. This factor stimulates RNA polymerase II to transcribe through regions of DNA that promote the formation of stalled ternary complexes. TFIIS is composed of three structural domains, termed I, II, and III. The two C-terminal domains (II and III), this domain and Pfam:PF01096 are required for transcription activity [PMID:8855225].", "896": "PF02885\nGlycosyl transferase family, helical bundle domain\nThis family includes anthranilate phosphoribosyltransferase (TrpD), thymidine phosphorylase. All these proteins can transfer a phosphorylated ribose substrate.", "897": "PF10396\nGTP-binding protein TrmE N-terminus\nThis family represents the shorter, B, chain of the homo-dimeric structure which is a guanine nucleotide-binding protein that binds and hydrolyses GTP. TrmE is homologous to the tetrahydrofolate-binding domain of N,N-dimethylglycine oxidase and indeed binds formyl-tetrahydrofolate. TrmE actively participates in the formylation reaction of uridine and regulates the ensuing hydrogenation reaction of a Schiff's base intermediate. 
This B chain is the N-terminal portion of the protein consisting of five beta-strands and three alpha helices and is necessary for mediating dimer formation within the protein [PMID:15616586].", "898": "PF13478\nXdhC Rossmann domain\nThis entry is the rossmann domain found in the Xanthine dehydrogenase accessory protein.", "899": "PF03988\nRepeat of Unknown Function (DUF347)\nThis repeat is found as four tandem repeats in a family of bacterial membrane proteins. Each repeat contains two transmembrane regions and a conserved tryptophan.", "900": "PF14804\nJag N-terminus\nThis domain is found at the N-terminus of proteins containing Pfam:PF13083 and Pfam:PF01424, including the jag proteins.", "901": "PF01769\nDivalent cation transporter\nThis region is the integral membrane part of the eubacterial MgtE family of magnesium transporters. Related regions are found also in archaebacterial and eukaryotic proteins. All the archaebacterial and eukaryotic examples have two copies of the region. This suggests that the eubacterial examples may act as dimers. Members of this family probably transport Mg2+ or other divalent cations into the cell. The alignment contains two highly conserved aspartates that may be involved in cation binding (Bateman A unpubl.)", "902": "PF00596\nClass II Aldolase and Adducin N-terminal domain\nThis family includes class II aldolases and adducins which have not been ascribed any enzymatic function.", "903": "PF01428\nAN1-like Zinc finger\nZinc finger at the C-terminus of An1 Swiss:Q91889, a ubiquitin-like protein in Xenopus laevis. The following pattern describes the zinc finger. 
C-X2-C-X(9-12)-C-X(1-2)-C-X4-C-X2-H-X5-H-X-C Where X can be any amino acid, and numbers in brackets indicate the number of residues.", "904": "PF09995\nER-bound oxygenase mpaB/B'/Rubber oxygenase, catalytic domain\nThis is the catalytic domain found in the endoplasmic reticulum (ER) -bound oxygenases mpaB' (MPAB2) and mpaB (MPAB) from Penicillium roqueforti and Penicillium brevicompactum and in the rubber oxygenase (Lcp) from Streptomyces sp., which contains highly conserved arginine and histidine residues. Structural analysis from Lcp revealed that Arg164, Thr168 and His198 are crucial active site residues [PMID:28733658]. The mpaB and mpaB' are part of the gene cluster that mediates the biosynthesis of mycophenolic acid (MPA) [PMID:31209052]. Lcp (Latex clearing proteins) is a rubber oxygenase that catalyses the extracellular cleavage of poly (cis-1,4-isoprene) [PMID:28733658, PMID:25819959]. This domain is also present in uncharacterised proteins from Mycobacterium sp. and hypothetical proteins, mainly from bacteria and fungi.", "905": "PF01809\nPutative membrane protein insertion efficiency factor\nThis family consists of membrane insertion efficiency factor proteins. They contain three conserved cysteine residues. Family members such as YidD may be involved in insertion of integral membrane proteins into the membrane by assisting YidC (membrane protein insertase). Some members of the yidD family have been previously thought to posses alpha-hemolysin activity, however no sufficient evidence was found to corroborate this idea. Secondary structure prediction indicated the presence of three alpha-helices in YidD. None of the three alpha-helices appeared sufficiently hydrophobic to serve as a transmembrane, suggesting a cytoplasmic localization for YidD. 
However, a closer examination of the alpha-helical wheel projection of the predicted first alpha-helix in YidD suggested an amphipathic structure in its N-terminal region which might be involved in membrane targeting [PMID:21803992].", "906": "Catalase", "907": "Ribosomal protein S8", "908": "PF01715\nIPP transferase\nThis is a family of IPP transferases EC:2.5.1.8 also known as tRNA delta(2)-isopentenylpyrophosphate transferase. These enzymes modify both cytoplasmic and mitochondrial tRNAs at A(37) to give isopentenyl A(37) [PMID:8139535].", "909": "Glycosyl hydrolase family 9", "910": "PF00986\nDNA gyrase B subunit, carboxyl terminus\nThe amino terminus of eukaryotic and prokaryotic DNA topoisomerase II are similar, but they have a different carboxyl terminus. The amino-terminal portion of the DNA gyrase B protein is thought to catalyse the ATP-dependent super-coiling of DNA. See Pfam:PF00204. The carboxyl-terminal end supports the complexation with the DNA gyrase A protein and the ATP-independent relaxation. This family also contains Topoisomerase IV. This is a bacterial enzyme that is closely related to DNA gyrase, [PMID:7770916].", "911": "PF06689\nClpX C4-type zinc finger\nThe ClpX heat shock protein of Escherichia coli is a member of the universally conserved Hsp100 family of proteins, and possesses a putative zinc finger motif of the C4 type. This presumed zinc binding domain is found at the N-terminus of the ClpX protein. ClpX is an ATPase which functions both as a substrate specificity component of the ClpXP protease and as a molecular chaperone. The molecular function of this domain is now known.", "912": "PF01804\nPenicillin amidase\nPenicillin amidase or penicillin acylase EC:3.5.1.11 catalyses the hydrolysis of benzylpenicillin to phenylacetic acid and 6-aminopenicillanic acid (6-APA) a key intermediate in the the synthesis of penicillins [PMID:9292993]. 
Also in the family is cephalosporin acylase Swiss:P07662 and Swiss:P29958 aculeacin A acylase which are involved in the synthesis of related peptide antibiotics.", "913": "PF09298\nFumarylacetoacetase N-terminal\nThe N-terminal domain of fumarylacetoacetate hydrolase is functionally uncharacterised, and adopts a structure consisting of an SH3-like barrel [PMID:11154690].", "914": "PF10996\nBeta-Casp domain\nThe beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains [PMID:17128255].", "915": "Glycosyl transferases group 1", "916": "PF00925\nGTP cyclohydrolase II\nGTP cyclohydrolase II catalyses the first committed step in the biosynthesis of riboflavin.", "917": "RNase H-like domain found in reverse transcriptase", "918": "PF03119\nNAD-dependent DNA ligase C4 zinc finger domain\nDNA ligases catalyse the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilising either ATP or NAD(+) as a cofactor [PMID:10698952]. This family is a small zinc binding motif that is presumably DNA binding [PMID:10698952]. IT is found only in NAD dependent DNA ligases [PMID:10698952].", "919": "PF06803\nProtein of unknown function (DUF1232)\nThis family represents a conserved region of approximately 60 residues within a number of hypothetical bacterial and archaeal proteins of unknown function.", "920": "PF02910\nFumarate reductase flavoprotein C-term\nThis family contains fumarate reductases, succinate dehydrogenases and L-aspartate oxidases.", "921": "PF03799\nCell division protein FtsQ/DivIB, C-terminal\nFtsQ is an essential cell division protein. It may link together the upstream cell division proteins, which are predominantly cytoplasmic, with the downstream cell division proteins, which are predominantly periplasmic [PMID:17185541]. FtsQ may control the correct divisome assembly [PMID:19233928]. 
DivIB is a cell division protein from Gram-positive bacteria, probably homologous to Escherichia coli FtsQ. DivIB interacts with FtsL, DivIC and PBP-2B [PMID:16936019, PMID:20870765]. DivIB plays an essential role in division at high temperatures, maybe by protecting FtsL from degradation or by promoting formation of the FtsL-DivIC complex [PMID:10792716]. It is also required for efficient sporulation at all temperatures [PMID:16936026]. FtsQ and DivIB have a short N-terminal cytoplasmic domain and a larger C-terminal periplasmic domain [PMID:19233928, PMID:20870765]. This entry represents the C-terminal region.", "922": "PF02586\nSOS response associated peptidase (SRAP)\nThe SRAP family functions as a DNA-associated autoproteolytic switch that recruits diverse repair enzymes onto DNA damage. We propose that the human protein Q96FZ2:UniProtKB, the eukaryotic member of the SRAP family, which has been recently shown to bind specifically to DNA with 5-hydroxymethylcytosine, 5-formylcytosine and 5-carboxycytosine, is a sensor for these oxidized bases generated by the TET (tetrahedral aminopeptidase of the M42 family) enzymes from methylcytosine. Hence, its autoproteolytic activity might help it act as a switch that recruits DNA repair enzymes to remove these oxidized methylcytosine species as part of the DNA demethylation pathway downstream of the TET enzymes.", "923": "PF12673\nDomain of unknown function (DUF3794)\nThis presumed domain is functionally uncharacterised. This domain family is found in bacteria, and is approximately 90 amino acids in length. The family is found in association with Pfam:PF01476.", "924": "PF01693\nCaulimovirus viroplasmin\nThis family consists of various caulimovirus viroplasmin proteins. The viroplasmin protein is encoded by gene VI and is the main component of viral inclusion bodies or viroplasms [PMID:2402462]. Inclusions are the site of viral assembly, DNA synthesis and accumulation [PMID:2402462]. 
Two domains exist within gene VI corresponding approximately to the 5' third and middle third of gene VI, these influence systemic infection in a light-dependent manner [PMID:8372449].", "925": "PF08742\nC8 domain\nThis domain contains 8 conserved cysteine residues, but this family only contains 7 of them to overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing Pfam:PF00094 and Pfam:PF01826.", "926": "PF01987\nMitochondrial biogenesis AIM24\nIn eukaryotes, this domain is involved in mitochondrial biogenesis [PMID:19300474]. Its function in prokaryotes in unknown.", "927": "PF16344\nDomain of unknown function (DUF4974)\nThis family consists of uncharacterized proteins around 340 residues in length and is mainly found in various Bacteroides and Parabacterodies species. The function of this protein is unknown.", "928": "PF07715\nTonB-dependent Receptor Plug Domain\nThe Plug domain has been shown to be an independently folding subunit of the TonB-dependent receptors ([PMID:15111112]). It acts as the channel gate, blocking the pore until the channel is bound by ligand. At this point it under goes conformational changes opens the channel.", "929": "Tetrahydrofolate dehydrogenase/cyclohydrolase, catalytic domain", "930": "PF04000\nSas10/Utp3/C1D family\nThis family contains Utp3 and LCP5 which are components of the U3 ribonucleoprotein complex [PMID:12068309][PMID:10690410]. It also includes the human C1D protein and Saccharomyces cerevisiae YHR081W (rrp47), an exosome-associated protein required for the 3' processing of stable RNAs [PMID:12972615], and Sas10 which has been identified as a regulator of chromatin silencing [PMID:9611201]. 
This family also includes the human protein Neuroguidin an initiation factor 4E (eIF4E) binding protein [PMID:16705177].", "931": "PF07501\nG5 domain\nThis domain is found in a wide range of extracellular proteins. It is found tandemly repeated in up to 8 copies. It is found in the N-terminus of peptidases belonging to the M26 family which cleave human IgA. The domain is also found in proteins involved in metabolism of bacterial cell walls suggesting this domain may have an adhesive function.", "932": "PF04295\nD-galactarate dehydratase / Altronate hydrolase, C terminus\nFamily members include the C termini of D-galactarate dehydratase (EC:4.2.1.42) which is thought to catalyse the reaction D-galactarate = 5-keto-4-deoxy-D-glucarate + H2O, [PMID:9772162] and altronate hydrolase (altronic acid hydratase, EC:4.2.1.7), which catalyses D-altronate = 2-keto-2-deoxygluconate + H2O [PMID:9579062]. As purified, both enzymes are catalytically inactive in the absence of added Fe2+, Mn2+, and beta-mercaptoethanol. Synergistic activation of altronate hydrolase activity is seen in the presence of both iron and manganese ions, suggesting that the enzyme may have two ion binding sites. Mn2+ appears to be part of the enzyme active centre, but the function of the single bound Fe2+ ion is unknown. The hydratase has no Fe-S core [PMID:3038546].", "933": "PF03755\nYicC-like family, N-terminal region\nFamily of bacterial proteins. Although poorly characterised, the members of this protein family have been demonstrated to play a role in stationary phase survival [PMID:1925027]. These proteins are not essential during stationary phase [PMID:1925027].", "934": "PF01713\nSmr domain\nThis family includes the Smr (Small MutS Related) proteins, and the C-terminal region of the MutS2 protein. It has been suggested that this domain interacts with the MutS1 Swiss:P23909 protein in the case of Smr proteins and with the N-terminal MutS related region of MutS2 Swiss:P94545 [PMID:10431172]. 
This domain exhibits nicking endonuclease activity that might have a role in mismatch repair or genetic recombination. It shows no significant double strand cleavage or exonuclease activity [PMID:12730195]. The full-length Swiss:Q86UW6 also has the polynucleotide kinase activity.", "935": "PF01724\nDomain of unknown function DUF29\nThis family consists of various hypothetical proteins from cyanobacteria, none of which are functionally described. The aligned region is approximately 120-140 amino acids long corresponding to almost the entire length of the proteins in the family. Swiss:Q2RPE2, PDB:3fcn, is a small protein that has a novel all-alpha fold. The N-terminal helical hairpin is likely to function as a dimerisation module. This protein is a member of PFam family PF01724. The function of this protein is unknown. One protein sequence contains a fusion of this protein and a DnaB domain, suggesting a possible role in DNA helicase activity (hypothetical). Dali hits have low Z and high rmsd, suggesting probably only topological similarities (not functional relevance) (details derived from TOPSAN). The family has several highly conserved sequence motifs, including YD/ExD, DxxNVxEEIE, and CPY/F/W, as well as conserved tryptophans.", "936": "Uncharacterized protein family UPF0029", "937": "PF02615\nMalate/L-lactate dehydrogenase\nThis family consists of bacterial and archaeal Malate/L-lactate dehydrogenase. L-lactate dehydrogenase, EC:1.1.1.27, catalyses the reaction (S)-lactate + NAD(+) <=> pyruvate + NADH. Malate dehydrogenase, EC:1.1.1.37 and EC:1.1.1.82, catalyses the reactions: (S)-malate + NAD(+) <=> oxaloacetate + NADH, and (S)-malate + NADP(+) <=> oxaloacetate + NADPH respectively.", "938": "PF02580\nD-Tyr-tRNA(Tyr) deacylase\nThis family comprises of several D-Tyr-tRNA(Tyr) deacylase proteins. Cell growth inhibition by several d-amino acids can be explained by an in vivo production of d-aminoacyl-tRNA molecules. 
Escherichia coli and yeast cells express an enzyme, d-Tyr-tRNA(Tyr) deacylase, capable of recycling such d-aminoacyl-tRNA molecules into free tRNA and d-amino acid. Accordingly, upon inactivation of the genes of the above deacylases, the toxicity of d-amino acids increases. Orthologues of the deacylase are found in many cells [PMID:11568181].The D-aminoacyl-tRNA deacylase (DTD) enzyme is homodimeric with two active sites located at the dimeric interface. Each active site carries an invariant Gly-cisPro dipeptide motif in each monomer. The interaction between the dipeptide motifs from each monomer ensures substrate stereospecificity. This family also includes a subclass of DTDs which is present in Chordata and harbors a Gly-transPro motif. The cis to trans switch is the key to Animal DTDs (ATD) gaining of L-chiral selectivity. This 'gain of function' through relaxation of substrate chiral specificity underlies ATD's capability of correcting the error in tRNA selection [PMID:29410408].", "939": "SmpB protein", "940": "PF01990\nATP synthase (F/14-kDa) subunit\nThis family includes 14-kDa subunit from vATPases [PMID:8682310], which is in the peripheral catalytic part of the complex [PMID:8621738]. The family also includes archaebacterial ATP synthase subunit F [PMID:8702544].", "941": "PF02325\nYGGT family\nThis family includes subunit CCB3 of cofactor assembly complex C, involved in c-type cytochrome maturation in photosynthetic organisms [PMID:18593701]. This family also includes uncharacterised protein YggT, associated with bacteria outside the cyanobacteria.", "942": "PF04547\nCalcium-activated chloride channel\nThe family carries eight putative transmembrane domains, and, although it has no similarity to other known channel proteins, it is clearly a calcium-activated ionic channel. 
It is expressed in various secretory epithelia, the retina and sensory neurons, and mediates receptor-activated chloride currents in diverse physiological processes [PMID:18724360].", "943": "PF16113\nEnoyl-CoA hydratase/isomerase\nThis family contains a diverse set of enzymes including: enoyl-CoA hydratase, napthoate synthase, carnitate racemase, 3-hydroxybutyryl-CoA dehydratase and dodecanoyl-CoA delta-isomerase. This family differs from Pfam:PF00378 in the structure of it's C-terminus.", "944": "PF17862\nAAA+ lid domain\nThis entry represents the alpha helical AAA+ lid domain that is found to the C-terminus of AAA domains.", "945": "PF14667\nPolysaccharide biosynthesis C-terminal domain\nThis family represents the C-terminal integral membrane region of polysaccharide biosynthesis proteins.", "946": "PF16874\nGlycosyl hydrolase family 36 C-terminal domain\nThis domain is found at the C-terminus of many family 36 glycoside hydrolases. It has a beta-sandwich structure with a Greek key motif [PMID:23012371].", "947": "AzlC protein", "948": "Glucose-6-phosphate dehydrogenase, C-terminal domain", "949": "PF08818\nDomain of unknown function (DU1801)\nThis domain is found in the Intracellular iron chaperone frataxin YdhG [PMID:21744456] and the uncharacterised protein YdeI from Bacillus subtilis. ", "950": "PF00036\nEF hand\nThe EF-hands can be divided into two classes: signalling proteins and buffering/transport proteins. The first group is the largest and includes the most well-known members of the family such as calmodulin, troponin C and S100B. These proteins typically undergo a calcium-dependent conformational change which opens a target binding site. 
The latter group is represented by calbindin D9k and do not undergo calcium dependent conformational changes.", "951": "Homoserine dehydrogenase", "952": "Carboxypeptidase regulatory-like domain", "953": "PF02985\nHEAT repeat\nThe HEAT repeat family is related to armadillo/beta-catenin-like repeats (see Pfam:PF00514).", "954": "PF08497\nRadical SAM N-terminal\nThis domain tends to occur to the N-terminus of the Pfam:PF04055 domain in hypothetical bacterial proteins.", "955": "PF04234\nCopC domain\nCopC is a bacterial blue copper protein that binds 1 atom of copper per protein molecule. Along with CopA, CopC mediates copper resistance by sequestration of copper in the periplasm [PMID:1924351].", "956": "PF17517\nIgGFc binding protein\nThis domain is found at the N terminal of Swiss:Q9Y6R7 and has been shown to confer IgG Fc binding activity [PMID:9182547]. It may play a role in immune protection and inflammation in the intestines of primates [PMID:9182547].", "957": "PF04149\nDomain of unknown function (DUF397)\nThe function of this family is unknown. ", "958": "PF12662\nComplement Clr-like EGF-like\ncEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. 
Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue.", "959": "PF07562\nNine Cysteines Domain of family 3 GPCR\nThis conserved sequence contains several highly-conserved Cys residues that are predicted to form disulphide bridges. It is predicted to lie outside the cell membrane, tethered to the Pfam:PF00003 in several receptor proteins.", "960": "PF09994\nUncharacterized alpha/beta hydrolase domain (DUF2235)\nThis domain, found in various hypothetical bacterial proteins, has no known function.", "961": "PF16491\nCAAX prenyl protease N-terminal, five membrane helices\nThe five N-terminal five transmembrane alpha-helices of peptidase_M48 family proteins including the CAAX prenyl proteases reside completely within the membrane of the endoplasmic reticulum.", "962": "PF03929\nPepSY-associated TM region\nThe PepSY_TM family is so named because it is an alignment of up to five transmembranes helices found in bacterial species some of which carry a nested PepSY domain, Pfam:PF03413.", "963": "PF07662\nNa+ dependent nucleoside transporter C-terminus\nThis family consists of nucleoside transport proteins. Swiss:Q62773 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane [PMID:7775409]. Swiss:Q62674 is a a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine it also transports the anti-viral nucleoside analogues AZT and ddC [PMID:8027026]. 
This alignment covers the C-terminus of this family of transporters.", "964": "PF00984\nUDP-glucose/GDP-mannose dehydrogenase family, central domain\nThe UDP-glucose/GDP-mannose dehydrogenaseses are a small group of enzymes which possesses the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate [PMID:9013585].", "965": "PF04777\nErv1 / Alr family\nBiogenesis of Fe/S clusters involves a number of essential mitochondrial proteins. Erv1p of Saccharomyces cerevisiae mitochondria is required for the maturation of Fe/S proteins in the cytosol. The ALR (augmenter of liver regeneration) represents a mammalian orthologue of yeast Erv1p. Both Erv1p and full-length ALR are located in the mitochondrial intermembrane an d it thought to operate downstream of the mitochondrial ABC transporter [PMID:11493598]. ", "966": "PF03091\nCutA1 divalent ion tolerance protein\nSeveral gene loci with a possible involvement in cellular tolerance to copper have been identified [PMID:7623666]. One such locus in eubacteria and archaebacteria, cutA, is thought to be involved in cellular tolerance to a wide variety of divalent cations other than copper. The cutA locus consists of two operons, of one and two genes. The CutA1 protein is a cytoplasmic protein, encoded by the single-gene operon and has been linked to divalent cation tolerance. It has no recognised structural motifs [PMID:9260936]. This family also contains putative proteins from eukaryotes (human and Drosophila).", "967": "PF09369\nMrfA Zn-binding domain\nThis is the C-terminal MrfA Zn+2-binding domain (MZB, also referred to as DUF1998) which contains a conserved four-cysteine signature motif. These four Cys reside in a short coil between two alpha-helices and form a metal ion-binding site [PMID:33300032]. 
This domain is frequently found at the C-terminal of ndNTPases, however, it is also found encoded in a standalone gene, downstream of putative helicase domain-encoding genes associated with bacterial anti-phage defense system DISARM. MrfA (Mitomycin repair factor A, also known as YprA in Bacillus subtilis) is a DNA helicase that supports repair of mitomycin C-induced DNA damage. MrfA homologues are widely distributed in bacteria and are also present in archaea, fungi and plants. The MrfA-homologue in yeast, Hrq1, also reduces mitomycin C sensitivity. Hrq1 has high similarity to human RecQ4 and was therefore assigned to the RecQ-like helicase family. MrfA homologues appear to be missing in Enterobacteria, however, certain pathogenic Escherichia coli and Salmonella strains harbour Z5898-like helicases with this domain [PMID:33300032].", "968": "PF14420\nClr5 domain\nThis domain is found at the N-terminus of the Clr5 protein which has been shown to be involved in silencing in fission yeast. This domain has been found to often be associated with proteins that contain ankyrin repeats and large regions of disordered sequence [PMID:21253571].", "969": "PF04898\nGlutamate synthase central domain\nThe central domain of glutamate synthase connects the amino terminal amidotransferase domain with the FMN-binding domain and has an alpha / beta overall topology [PMID:11967268]. This domain appears to be a rudimentary form of the FMN-binding TIM barrel according to SCOP.", "970": "PF08379\nBacterial transglutaminase-like N-terminal region\nThis region is found towards the N-terminus of various archaeal and bacterial hypothetical proteins. 
Some of these are annotated as being transglutaminase-like proteins, and in fact contain a transglutaminase-like superfamily domain (Pfam:PF01841).", "971": "PF01654\nCytochrome bd terminal oxidase subunit I\nThis family are the alternative oxidases found in many bacteria which oxidise ubiquinol and reduce oxygen as part of the electron transport chain. This family is the subunit I of the oxidase E. coli has two copies of the oxidase, bo and bd', both of which are represented here In some nitrogen fixing bacteria, e.g. Klebsiella pneumoniae this oxidase is responsible for removing oxygen in microaerobic conditions, making the oxidase required for nitrogen fixation. This subunit binds a single b-haem, through ligands at His186 and Met393 (using SW:P11026 numbering). In addition His19 is a ligand for the haem b found in subunit II", "972": "PF08340\nDomain of unknown function (DUF1732)\nThis domain of unknown function is often found at the C-terminus of bacterial proteins, many of which are hypothetical, including proteins of the YicC family which have Pfam:PF03755 at the N-terminus. These include a protein important in the stationary phase of growth, and required for growth at high temperature [PMID:1925027]. Structural modelling suggests this domain may bind nucleic acids [PMID:21348639].", "973": "PF01297\nZinc-uptake complex component A periplasmic\nZnuA includes periplasmic solute binding proteins such as TroA that interacts with an ATP-binding cassette transport system in Treponema pallidum [PMID:10404217]. ZnuA is part of the bacterial zinc-uptake complex ZnuABC, whose components are the following families, ZinT, Pfam:PF09223, Pfam:PF00950, Pfam:PF00005, all of which are regulated by the transcription-regulator family FUR, Pfam:PF01475. ZinT acts as a Zn2+-buffering protein that delivers Zn2+ to ZnuA (TroA), a high-affinity zinc-uptake protein. 
In Gram-negative bacteria the ZnuABC transporter system ensures an adequate import of zinc in Zn2+-poor environments, such as those encountered by pathogens within the infected host [PMID:21338480, PMID:24128931].", "974": "PF03946\nRibosomal protein L11, N-terminal domain\nThe N-terminal domain of Ribosomal protein L11 adopts an alpha/beta fold and is followed by the RNA binding C-terminal domain.", "975": "PF03055\nRetinal pigment epithelial membrane protein\nThis family represents a retinal pigment epithelial membrane receptor which is abundantly expressed in retinal pigment epithelium, and binds plasma retinal binding protein. The family also includes the sequence related neoxanthin cleavage enzyme in plants and lignostilbene-alpha,beta-dioxygenase in bacteria.", "976": "PF01709\nTranscriptional regulator\nThis is a family of transcriptional regulators. In mammals, it activates the transcription of mitochondrially-encoded COX1 [PMID:19503089]. In bacteria, it negatively regulates the quorum-sensing response regulator by binding to its promoter region [PMID:18641136].", "977": "PF02730\nAldehyde ferredoxin oxidoreductase, N-terminal domain\nAldehyde ferredoxin oxidoreductase (AOR) catalyses the reversible oxidation of aldehydes to their corresponding carboxylic acids with their accompanying reduction of the redox protein ferredoxin. This domain interacts with the tungsten cofactor [PMID:7878465].", "978": "PF03729\nShort repeat of unknown function (DUF308)\nFamily of short repeats that occurs in a limited number of membrane proteins. It may divide further in short repeats of around 7-10 residues of the pattern G-#-X(2)-#(2)-X (#=hydrophobic).", "979": "PF01969\nNickel insertion protein\nMembers of this family may be involved in the activation of nickel-pincer cofactor-dependent enzymes. LarC from Lactobacillus plantarum is involved, together with LarB and LarE, in the synthesis of the enzyme-bound cofactor of lactate racemase (LarA). 
Larc C binds Ni2+, and functions in nickel delivery to pyridinium-3,5-bisthiocarboxylic acid mononucleotide (P2TMN), to form the mature cofactor [PMID:24710389, PMID:27114550].", "980": "PF03313\nSerine dehydratase alpha chain\nL-serine dehydratase (EC:4.2.1.13) is a found as a heterodimer of alpha and beta chain or as a fusion of the two chains in a single protein. This enzyme catalyses the deamination of serine to form pyruvate. This enzyme is part of the gluconeogenesis pathway.", "981": "PF05164\nCell division protein ZapA\nZapA is a cell division protein which interacts with FtsZ. FtsZ is part of a mid-cell cytokinetic structure termed the Z-ring that recruits a hierarchy of fission related proteins early in the bacterial cell cycle. The interaction of FtsZ with ZapA drives its polymerisation and promotes FtsZ filament bundling thereby contributing to the spatio-temporal tuning of the Z-ring [PMID:15288790][PMID:12368265].", "982": "PF00677\nLumazine binding domain\nThis domain binds to derivatives of lumazine in some proteins. Some proteins have lost the residues involved in binding lumazine.", "983": "phosphotransferase system, EIIB", "984": "PF01980\ntRNA-methyltransferase O\nThis family includes members such as TrmO (tRNA-methyltransferase O) also known as YaeB, which contains a single-sheeted beta-barrel structure. TrmO is an AdoMet-dependent methyltransferase responsible for m6t6A formation [PMID:25063302]. Its human homolog, is responsible for formation of m6t6A37 in cytoplasmic tRNASer. Lack of TrmO decreases attenuation activity of the thr operon, indicating that N6 methylation of m6t6A37 ensures efficient decoding of ACY codons [PMID:27913733]. In bacteria and eukaryotes, TrmO has a C-terminal domain containing the conserved DPRxxY motif. Where the Asp194 and Arg196 in this motif of E. coli TrmO are necessary for N6-methylation. 
However, no archaeal YaeB has a C-terminal domain containing the DPRxxY motif that is conserved in bacterial and mammalian TrmO homologs [PMID:25063302].", "985": "PF00080\nCopper/zinc superoxide dismutase (SODC)\nSuperoxide dismutases (SODs) catalyse the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). The structure is an eight-stranded beta sandwich, similar to the immunoglobulin fold.", "986": "Ribosomal protein S2", "987": "PF05192\nMutS domain III\nThis domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with Pfam:PF00488, Pfam:PF05188, Pfam:PF01624 and Pfam:PF05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein [PMID:8036718]. The aligned region corresponds with domain III, which is central to the structure of Thermus aquaticus MutS as characterised in [PMID:11048710].", "988": "PF13244\nMBH, subunit D\nHydrogen gas-evolving membrane-bound hydrogenase (MBH) is a respiratory complex homologous to the quinone-reducing Complex I. Like Complex I, MBH has peripheral and membrane arms. MBH is made of 14 subunits (MbhA-N). MbhJ, K, L, N and M form the Membrane-anchored hydrogenase module. MbhJ, K, L, N are predicted to be exposed to the cytoplasm and form the peripheral arm. The remaining 10 subunits are predicted to be integral membrane proteins forming the membrane arm, made of 44 transmembrane helices (TMH) [PMID:33229520, PMID:32735215].
MbhA, B, C and F form the Sodium translocation module. MbhD, E, G and H form the Proton translocation module. MbhI is the linker between the hydrogenase module and the proton-translocating membrane module. It anchors the discontinuous TMH7 of MbhH via its middle lateral helix and the C-terminal of TMH2, found in MbhE. MbhD and MbhE together are equivalent to Nqo10 of Complex I [PMID:29754813]. MbhD has three TM helices.", "989": "PF13932\ntRNA modifying enzyme MnmG/GidA C-terminal domain\nThe GidA associated domain is a domain that has been identified at the C-terminus of protein GidA. It consists of several helices, the last three being rather short and forming a small bundle. GidA is a tRNA modification enzyme found in bacteria and mitochondria. Based on mutational analysis this domain has been suggested to be implicated in binding of the D-stem of tRNA [PMID:19446527] and to be responsible for the interaction with protein MnmE [PMID:18565343]. Structures of GidA in complex with either tRNA or MnmE are missing. Reported to bind to Pfam family MnmE, Pfam:PF12631.", "990": "PF02465\nFlagellar hook-associated protein 2 N-terminus\nThe flagellar hook-associated protein 2 (HAP2 or FliD) forms the distal end of the flagella, and plays a role in mucin-specific adhesion of the bacteria [PMID:9488388]. This alignment covers the N-terminal region of this family of proteins.", "991": "Short C-terminal domain", "992": "PF01890\nCobalamin synthesis G C-terminus\nMembers of this family are involved in cobalamin synthesis. The gene encoded by Swiss:P72862 has been designated cbiH but in fact represents a fusion between cbiH and cbiG. As other multi-functional proteins involved in cobalamin biosynthesis catalyse adjacent steps in the pathway, including CysG, CobL (CbiET), CobIJ and CobA-HemD, it is therefore possible that CbiG catalyses a reaction step adjacent to CbiH.
In the anaerobic pathway such a step could be the formation of a gamma lactone, which is thought to help to mediate the anaerobic ring contraction process [PMID:9742225]. Within the cobalamin synthesis pathway CbiG catalyses both the opening of the lactone ring and the extrusion of the two-carbon fragment of cobalt-precorrin-5A from C-20 and its associated methyl group (deacylation) to give cobalt-precorrin-5B [PMID:16936030]. This family is the C-terminal region, and the mid- and N-terminal parts are conserved independently in other families.", "993": "PF00387\nPhosphatidylinositol-specific phospholipase C, Y domain\nThis associates with Pfam:PF00388 to form a single structural unit.", "994": "PF03883\nPeroxide stress protein YaaA\nYaaA is a key element of the stress response to H2O2. It acts by reducing intracellular iron levels after peroxide stress, thereby attenuating the Fenton reaction and the DNA damage that this would cause [PMID:21378183]. The molecular mechanism of action is not known.", "995": "PF16360\nGTP-binding GTPase Middle Region\nThis family locates between the N-terminal domain and MMR_HSR1 50S ribosome-binding GTPase of GTP-binding HflX-like proteins. The full-length members bind and interact with the 50S ribosome and are GTPases, hydrolysing GTP/GDP/ATP/ADP. The function of this region is unknown.", "996": "PF03737\nAldolase/RraA\nMembers of this family include regulator of ribonuclease E activity A (RraA) and 4-hydroxy-4-methyl-2-oxoglutarate (HMG)/4-carboxy-4-hydroxy-2-oxoadipate (CHA) aldolase, also known as RraA-like protein [PMID:24359411]. RraA acts as a trans-acting modulator of RNA turnover, binding essential endonuclease RNase E and inhibiting RNA processing [PMID:14499605].
RraA-like proteins seem to contain aldolase and/or decarboxylase activity either in place of or in addition to the RNase E inhibitor functions [PMID:24359411].", "997": "Ribosomal protein S17", "998": "PF09180\nProlyl-tRNA synthetase, C-terminal\nMembers of this family are predominantly found in prokaryotic prolyl-tRNA synthetase. They contain a zinc binding site, and adopt a structure consisting of alpha helices and antiparallel beta sheets arranged in 2 layers, in a beta-alpha-beta-alpha-beta motif [PMID:12578991].", "999": "PF16640\nBacterial Ig-like domain (group 3)\nThis family consists of bacterial domains with an Ig-like fold.", "1000": "unk" }, "idx2label": { "0": "PF17802", "1": "PF05013", "2": "PF18821", "3": "PF18075", "4": "PF04039", "5": "PF06130", "6": "PF02654", "7": "PF00163", "8": "PF00719", "9": "PF06165", "10": "PF17921", "11": "PF00376", "12": "PF02466", "13": "PF00237", "14": "PF05649", "15": "PF06094", "16": "PF01369", "17": "PF01649", "18": "PF01975", "19": "PF07503", "20": "PF00731", "21": "PF03334", "22": "PF01985", "23": "PF11760", "24": "PF01268", "25": "PF04055", "26": "PF08712", "27": "PF00233", "28": "PF02537", "29": "PF04122", "30": "PF03619", "31": "PF08148", "32": "PF02739", "33": "PF03323", "34": "PF04237", "35": "PF10436", "36": "PF06421", "37": "PF04253", "38": "PF01657", "39": "PF02545", "40": "PF02811", "41": "PF00396", "42": "PF00557", "43": "PF07884", "44": "PF00312", "45": "PF03862", "46": "PF01339", "47": "PF17757", "48": "PF09269", "49": "PF03449", "50": "PF02686", "51": "PF02777", "52": "PF01867", "53": "PF17202", "54": "PF01193", "55": "PF01904", "56": "PF08486", "57": "PF06965", "58": "PF14805", "59": "PF16124", "60": "PF14403", "61": "PF02881", "62": "PF13565", "63": "PF02625", "64": "PF02436", "65": "PF13793", "66": "PF08393", "67": "PF00200", "68": "PF07804", "69": "PF02383", "70": "PF08299", "71": "PF06580", "72": "PF00484", "73": "PF02152", "74": "PF16353", "75": "PF04892", "76": "PF17762", "77": "PF06724", "78": 
"PF17853", "79": "PF00515", "80": "PF00278", "81": "PF14508", "82": "PF02817", "83": "PF06719", "84": "PF03098", "85": "PF02563", "86": "PF04519", "87": "PF05383", "88": "PF01253", "89": "PF15901", "90": "PF04675", "91": "PF00490", "92": "PF04316", "93": "PF03613", "94": "PF12399", "95": "PF01250", "96": "PF16123", "97": "PF01484", "98": "PF02547", "99": "PF05201", "100": "PF02632", "101": "PF01226", "102": "PF01237", "103": "PF08338", "104": "PF01368", "105": "PF17912", "106": "PF00988", "107": "PF04402", "108": "PF10431", "109": "PF02953", "110": "PF00353", "111": "PF13732", "112": "PF01386", "113": "PF00381", "114": "PF01197", "115": "PF02887", "116": "PF03952", "117": "PF03733", "118": "PF00766", "119": "PF04296", "120": "PF00858", "121": "PF04371", "122": "PF14905", "123": "PF01799", "124": "PF03457", "125": "PF00100", "126": "PF01116", "127": "PF04367", "128": "PF03015", "129": "PF04205", "130": "PF01281", "131": "PF14450", "132": "PF00709", "133": "PF13207", "134": "PF03116", "135": "PF05960", "136": "PF00281", "137": "PF03776", "138": "PF05226", "139": "PF17871", "140": "PF00687", "141": "PF06245", "142": "PF02646", "143": "PF16863", "144": "PF02583", "145": "PF01556", "146": "PF06686", "147": "PF01035", "148": "PF02618", "149": "PF13517", "150": "PF13537", "151": "PF16859", "152": "PF00327", "153": "PF16199", "154": "PF00586", "155": "PF04043", "156": "PF02260", "157": "PF01957", "158": "PF00288", "159": "PF08542", "160": "PF02457", "161": "PF07332", "162": "PF13408", "163": "PF01970", "164": "PF07538", "165": "PF04973", "166": "PF04003", "167": "PF01642", "168": "PF05191", "169": "PF07703", "170": "PF13288", "171": "PF13561", "172": "PF03947", "173": "PF06026", "174": "PF05552", "175": "PF03443", "176": "PF02601", "177": "PF07486", "178": "PF01813", "179": "PF02491", "180": "PF04377", "181": "PF09362", "182": "PF00734", "183": "PF13475", "184": "PF12874", "185": "PF17764", "186": "PF03719", "187": "PF00885", "188": "PF05257", "189": "PF03315", "190": 
"PF01264", "191": "PF01255", "192": "PF00013", "193": "PF00241", "194": "PF04117", "195": "PF02391", "196": "PF01139", "197": "PF02673", "198": "PF02445", "199": "PF13445", "200": "PF13472", "201": "PF13850", "202": "PF04024", "203": "PF04390", "204": "PF16653", "205": "PF00468", "206": "PF05161", "207": "PF01149", "208": "PF11799", "209": "PF12344", "210": "PF01302", "211": "PF04810", "212": "PF09723", "213": "PF16016", "214": "PF16901", "215": "PF03830", "216": "PF00037", "217": "PF00771", "218": "PF08766", "219": "PF12775", "220": "PF00902", "221": "PF03479", "222": "PF14278", "223": "PF10150", "224": "PF01702", "225": "PF00238", "226": "PF02322", "227": "PF03740", "228": "PF16488", "229": "PF09397", "230": "PF01221", "231": "PF08310", "232": "PF02844", "233": "PF02518", "234": "PF03484", "235": "PF00813", "236": "PF00593", "237": "PF00300", "238": "PF02609", "239": "PF01798", "240": "PF00936", "241": "PF00252", "242": "PF17676", "243": "PF04229", "244": "PF13570", "245": "PF02787", "246": "PF01510", "247": "PF07971", "248": "PF06172", "249": "PF02582", "250": "PF06071", "251": "PF03797", "252": "PF10135", "253": "PF01722", "254": "PF12804", "255": "PF14319", "256": "PF03796", "257": "PF17940", "258": "PF00338", "259": "PF00673", "260": "PF00149", "261": "PF00699", "262": "PF03466", "263": "PF17803", "264": "PF02277", "265": "PF13305", "266": "PF04413", "267": "PF00019", "268": "PF02367", "269": "PF01790", "270": "PF04515", "271": "PF13634", "272": "PF13340", "273": "PF02571", "274": "PF05697", "275": "PF00842", "276": "PF02570", "277": "PF07811", "278": "PF01220", "279": "PF00333", "280": "PF00189", "281": "PF16326", "282": "PF05485", "283": "PF06961", "284": "PF07927", "285": "PF02482", "286": "PF14510", "287": "PF02511", "288": "PF00887", "289": "PF01239", "290": "PF01367", "291": "PF12911", "292": "PF13004", "293": "PF13559", "294": "PF02194", "295": "PF01944", "296": "PF02569", "297": "PF00745", "298": "PF17678", "299": "PF03028", "300": "PF06912", "301": 
"PF00681", "302": "PF01336", "303": "PF10017", "304": "PF00146", "305": "PF17910", "306": "PF04961", "307": "PF13802", "308": "PF01259", "309": "PF13742", "310": "PF02771", "311": "PF02690", "312": "PF01161", "313": "PF03705", "314": "PF10728", "315": "PF05198", "316": "PF12464", "317": "PF02978", "318": "PF18074", "319": "PF07497", "320": "PF04430", "321": "PF03124", "322": "PF03372", "323": "PF01219", "324": "PF04298", "325": "PF00115", "326": "PF16220", "327": "PF00624", "328": "PF14791", "329": "PF03134", "330": "PF04815", "331": "PF02863", "332": "PF03609", "333": "PF17852", "334": "PF01687", "335": "PF03948", "336": "PF08669", "337": "PF01832", "338": "PF01168", "339": "PF02244", "340": "PF07004", "341": "PF02607", "342": "PF03748", "343": "PF02503", "344": "PF02592", "345": "PF06422", "346": "PF14031", "347": "PF05494", "348": "PF01040", "349": "PF04327", "350": "PF09527", "351": "PF13280", "352": "PF09719", "353": "PF02965", "354": "PF17759", "355": "PF00830", "356": "PF02502", "357": "PF03914", "358": "PF13597", "359": "PF02660", "360": "PF00330", "361": "PF00662", "362": "PF08487", "363": "PF13959", "364": "PF00343", "365": "PF01741", "366": "PF03808", "367": "PF16177", "368": "PF01195", "369": "PF02559", "370": "PF02548", "371": "PF01196", "372": "PF14490", "373": "PF03727", "374": "PF04408", "375": "PF04255", "376": "PF04015", "377": "PF13396", "378": "PF04461", "379": "PF01679", "380": "PF14815", "381": "PF02207", "382": "PF01535", "383": "PF13545", "384": "PF03073", "385": "PF01940", "386": "PF10589", "387": "PF00571", "388": "PF02225", "389": "PF02138", "390": "PF13774", "391": "PF05025", "392": "PF00162", "393": "PF00393", "394": "PF08494", "395": "PF13356", "396": "PF00194", "397": "PF02880", "398": "PF01430", "399": "PF02424", "400": "PF08522", "401": "PF16212", "402": "PF13234", "403": "PF00617", "404": "PF11929", "405": "PF02572", "406": "PF02733", "407": "PF02843", "408": "PF09383", "409": "PF09118", "410": "PF06441", "411": "PF02132", "412": 
"PF04982", "413": "PF01066", "414": "PF13376", "415": "PF01106", "416": "PF00920", "417": "PF07735", "418": "PF07664", "419": "PF00572", "420": "PF12937", "421": "PF10385", "422": "PF02590", "423": "PF01018", "424": "PF02538", "425": "PF00231", "426": "PF17876", "427": "PF08044", "428": "PF02729", "429": "PF01392", "430": "PF02657", "431": "PF01027", "432": "PF02405", "433": "PF00253", "434": "PF08345", "435": "PF01329", "436": "PF00883", "437": "PF18759", "438": "PF10369", "439": "PF13920", "440": "PF01730", "441": "PF05949", "442": "PF02698", "443": "PF13508", "444": "PF04248", "445": "PF01926", "446": "PF17042", "447": "PF00380", "448": "PF00344", "449": "PF13277", "450": "PF04960", "451": "PF09989", "452": "PF04860", "453": "PF03938", "454": "PF08436", "455": "PF01165", "456": "PF18766", "457": "PF00888", "458": "PF00476", "459": "PF09363", "460": "PF13927", "461": "PF00006", "462": "PF10035", "463": "PF03747", "464": "PF12806", "465": "PF01618", "466": "PF18198", "467": "PF00611", "468": "PF02375", "469": "PF02256", "470": "PF01357", "471": "PF00348", "472": "PF00391", "473": "PF02410", "474": "PF00400", "475": "PF01043", "476": "PF01773", "477": "PF09365", "478": "PF03588", "479": "PF04066", "480": "PF00815", "481": "PF00397", "482": "PF03489", "483": "PF03458", "484": "PF01176", "485": "PF11975", "486": "PF02643", "487": "PF07261", "488": "PF01765", "489": "PF03645", "490": "PF03618", "491": "PF13682", "492": "PF14416", "493": "PF02020", "494": "PF10646", "495": "PF01634", "496": "PF01052", "497": "PF18803", "498": "PF05598", "499": "PF00755", "500": "PF02308", "501": "PF04536", "502": "PF01817", "503": "PF07479", "504": "PF06541", "505": "PF03352", "506": "PF01992", "507": "PF03481", "508": "PF03030", "509": "PF01652", "510": "PF03626", "511": "PF05235", "512": "PF09371", "513": "PF01311", "514": "PF00365", "515": "PF16193", "516": "PF04087", "517": "PF06050", "518": "PF09285", "519": "PF02517", "520": "PF10442", "521": "PF12878", "522": "PF00472", "523": 
"PF12704", "524": "PF00104", "525": "PF10557", "526": "PF16188", "527": "PF00140", "528": "PF12686", "529": "PF03140", "530": "PF01450", "531": "PF08516", "532": "PF04366", "533": "PF01820", "534": "PF00560", "535": "PF06477", "536": "PF04092", "537": "PF10509", "538": "PF06968", "539": "PF16886", "540": "PF00499", "541": "PF13556", "542": "PF07907", "543": "PF03331", "544": "PF02415", "545": "PF04314", "546": "PF02594", "547": "PF01706", "548": "PF14226", "549": "PF02773", "550": "PF01169", "551": "PF08711", "552": "PF04241", "553": "PF00314", "554": "PF04294", "555": "PF00995", "556": "PF00137", "557": "PF17137", "558": "PF08029", "559": "PF02397", "560": "PF02671", "561": "PF04002", "562": "PF01384", "563": "PF05402", "564": "PF13012", "565": "PF01888", "566": "PF01055", "567": "PF03150", "568": "PF04127", "569": "PF07264", "570": "PF00177", "571": "PF16325", "572": "PF17827", "573": "PF13490", "574": "PF01245", "575": "PF12697", "576": "PF18199", "577": "PF07593", "578": "PF17768", "579": "PF02021", "580": "PF14497", "581": "PF02575", "582": "PF00646", "583": "PF03561", "584": "PF10590", "585": "PF05504", "586": "PF12396", "587": "PF01725", "588": "PF01502", "589": "PF01455", "590": "PF00576", "591": "PF13493", "592": "PF12781", "593": "PF13089", "594": "PF01155", "595": "PF04264", "596": "PF16320", "597": "PF12680", "598": "PF07963", "599": "PF00271", "600": "PF10825", "601": "PF00119", "602": "PF00433", "603": "PF03473", "604": "PF02775", "605": "PF03746", "606": "PF02446", "607": "PF01313", "608": "PF03861", "609": "PF02934", "610": "PF03461", "611": "PF01330", "612": "PF02033", "613": "PF00040", "614": "PF04299", "615": "PF00298", "616": "PF02669", "617": "PF01924", "618": "PF07549", "619": "PF04051", "620": "PF17941", "621": "PF05635", "622": "PF02595", "623": "PF03462", "624": "PF12661", "625": "PF01590", "626": "PF04463", "627": "PF02677", "628": "PF10415", "629": "PF07719", "630": "PF01379", "631": "PF14237", "632": "PF13368", "633": "PF01546", "634": 
"PF09754", "635": "PF03120", "636": "PF13405", "637": "PF00329", "638": "PF02603", "639": "PF14842", "640": "PF01424", "641": "PF01132", "642": "PF13399", "643": "PF01977", "644": "PF00438", "645": "PF02542", "646": "PF03367", "647": "PF03381", "648": "PF12002", "649": "PF05336", "650": "PF02626", "651": "PF16355", "652": "PF03595", "653": "PF00276", "654": "PF16658", "655": "PF02820", "656": "PF00475", "657": "PF00003", "658": "PF05524", "659": "PF08495", "660": "PF11987", "661": "PF09347", "662": "PF04168", "663": "PF00020", "664": "PF02508", "665": "PF17146", "666": "PF02578", "667": "PF00828", "668": "PF02016", "669": "PF00221", "670": "PF14748", "671": "PF02622", "672": "PF04472", "673": "PF14310", "674": "PF18072", "675": "PF01458", "676": "PF03636", "677": "PF01156", "678": "PF00684", "679": "PF13428", "680": "PF04172", "681": "PF07873", "682": "PF10410", "683": "PF08439", "684": "PF02245", "685": "PF16870", "686": "PF02650", "687": "PF01967", "688": "PF01887", "689": "PF00447", "690": "PF02873", "691": "PF01523", "692": "PF13449", "693": "PF00542", "694": "PF16192", "695": "PF01981", "696": "PF13649", "697": "PF00926", "698": "PF18741", "699": "PF14841", "700": "PF02201", "701": "PF03928", "702": "PF06133", "703": "PF12833", "704": "PF01960", "705": "PF09479", "706": "PF04403", "707": "PF04085", "708": "PF03483", "709": "PF14698", "710": "PF01179", "711": "PF17763", "712": "PF13167", "713": "PF00831", "714": "PF00203", "715": "PF01871", "716": "PF17801", "717": "PF01288", "718": "PF06750", "719": "PF02514", "720": "PF16189", "721": "PF00023", "722": "PF04325", "723": "PF02990", "724": "PF17755", "725": "PF04551", "726": "PF01088", "727": "PF02833", "728": "PF06949", "729": "PF02104", "730": "PF13335", "731": "PF07556", "732": "PF09827", "733": "PF04376", "734": "PF05195", "735": "PF00478", "736": "PF00358", "737": "PF04145", "738": "PF01632", "739": "PF13378", "740": "PF07516", "741": "PF13660", "742": "PF01514", "743": "PF17957", "744": "PF01628", "745": 
"PF14008", "746": "PF00886", "747": "PF03186", "748": "PF02742", "749": "PF04304", "750": "PF02561", "751": "PF01016", "752": "PF16875", "753": "PF04020", "754": "PF00880", "755": "PF01312", "756": "PF01923", "757": "PF08331", "758": "PF06628", "759": "PF01504", "760": "PF02096", "761": "PF03937", "762": "PF01825", "763": "PF01346", "764": "PF12951", "765": "PF11838", "766": "PF13676", "767": "PF00479", "768": "PF00829", "769": "PF02633", "770": "PF13975", "771": "PF13640", "772": "PF00250", "773": "PF18076", "774": "PF03840", "775": "PF00309", "776": "PF06022", "777": "PF02699", "778": "PF07494", "779": "PF04199", "780": "PF18758", "781": "PF03975", "782": "PF01416", "783": "PF05792", "784": "PF12392", "785": "PF08529", "786": "PF01899", "787": "PF01625", "788": "PF02417", "789": "PF07525", "790": "PF05033", "791": "PF10397", "792": "PF03707", "793": "PF10518", "794": "PF08459", "795": "PF08592", "796": "PF03600", "797": "PF02912", "798": "PF00174", "799": "PF03625", "800": "PF12631", "801": "PF10601", "802": "PF01641", "803": "PF04032", "804": "PF04166", "805": "PF11915", "806": "PF00584", "807": "PF18052", "808": "PF00268", "809": "PF13439", "810": "PF02261", "811": "PF07743", "812": "PF08376", "813": "PF02812", "814": "PF02929", "815": "PF03990", "816": "PF00095", "817": "PF04613", "818": "PF01352", "819": "PF03993", "820": "PF04963", "821": "PF02365", "822": "PF01774", "823": "PF03441", "824": "PF05658", "825": "PF02092", "826": "PF05173", "827": "PF03453", "828": "PF13677", "829": "PF13460", "830": "PF06418", "831": "PF00349", "832": "PF13894", "833": "PF01266", "834": "PF01084", "835": "PF03775", "836": "PF13720", "837": "PF00471", "838": "PF02738", "839": "PF01808", "840": "PF04427", "841": "PF14693", "842": "PF00401", "843": "PF18345", "844": "PF00285", "845": "PF17136", "846": "PF02700", "847": "PF06201", "848": "PF04749", "849": "PF00707", "850": "PF06429", "851": "PF00507", "852": "PF17963", "853": "PF04468", "854": "PF10576", "855": "PF01925", "856": 
"PF03478", "857": "PF01894", "858": "PF16321", "859": "PF01388", "860": "PF07075", "861": "PF00126", "862": "PF00908", "863": "PF01805", "864": "PF01252", "865": "PF06415", "866": "PF02401", "867": "PF05681", "868": "PF03880", "869": "PF10588", "870": "PF02233", "871": "PF03255", "872": "PF13976", "873": "PF04333", "874": "PF02574", "875": "PF02772", "876": "PF07331", "877": "PF00303", "878": "PF00346", "879": "PF00453", "880": "PF16916", "881": "PF01192", "882": "PF05184", "883": "PF08378", "884": "PF12019", "885": "PF10143", "886": "PF01988", "887": "PF03788", "888": "PF00486", "889": "PF00352", "890": "PF14714", "891": "PF02805", "892": "PF13382", "893": "PF16486", "894": "PF04060", "895": "PF07500", "896": "PF02885", "897": "PF10396", "898": "PF13478", "899": "PF03988", "900": "PF14804", "901": "PF01769", "902": "PF00596", "903": "PF01428", "904": "PF09995", "905": "PF01809", "906": "PF00199", "907": "PF00410", "908": "PF01715", "909": "PF00759", "910": "PF00986", "911": "PF06689", "912": "PF01804", "913": "PF09298", "914": "PF10996", "915": "PF13692", "916": "PF00925", "917": "PF17919", "918": "PF03119", "919": "PF06803", "920": "PF02910", "921": "PF03799", "922": "PF02586", "923": "PF12673", "924": "PF01693", "925": "PF08742", "926": "PF01987", "927": "PF16344", "928": "PF07715", "929": "PF00763", "930": "PF04000", "931": "PF07501", "932": "PF04295", "933": "PF03755", "934": "PF01713", "935": "PF01724", "936": "PF01205", "937": "PF02615", "938": "PF02580", "939": "PF01668", "940": "PF01990", "941": "PF02325", "942": "PF04547", "943": "PF16113", "944": "PF17862", "945": "PF14667", "946": "PF16874", "947": "PF03591", "948": "PF02781", "949": "PF08818", "950": "PF00036", "951": "PF00742", "952": "PF13620", "953": "PF02985", "954": "PF08497", "955": "PF04234", "956": "PF17517", "957": "PF04149", "958": "PF12662", "959": "PF07562", "960": "PF09994", "961": "PF16491", "962": "PF03929", "963": "PF07662", "964": "PF00984", "965": "PF04777", "966": "PF03091", "967": 
"PF09369", "968": "PF14420", "969": "PF04898", "970": "PF08379", "971": "PF01654", "972": "PF08340", "973": "PF01297", "974": "PF03946", "975": "PF03055", "976": "PF01709", "977": "PF02730", "978": "PF03729", "979": "PF01969", "980": "PF03313", "981": "PF05164", "982": "PF00677", "983": "PF00367", "984": "PF01980", "985": "PF00080", "986": "PF00318", "987": "PF05192", "988": "PF13244", "989": "PF13932", "990": "PF02465", "991": "PF09851", "992": "PF01890", "993": "PF00387", "994": "PF03883", "995": "PF16360", "996": "PF03737", "997": "PF00366", "998": "PF09180", "999": "PF16640" }, "initializer_range": 0.02, "intermediate_size": 3072, "label2id": { "PF00003": 657, "PF00006": 461, "PF00013": 192, "PF00019": 267, "PF00020": 663, "PF00023": 721, "PF00036": 950, "PF00037": 216, "PF00040": 613, "PF00080": 985, "PF00095": 816, "PF00100": 125, "PF00104": 524, "PF00115": 325, "PF00119": 601, "PF00126": 861, "PF00137": 556, "PF00140": 527, "PF00146": 304, "PF00149": 260, "PF00162": 392, "PF00163": 7, "PF00174": 798, "PF00177": 570, "PF00189": 280, "PF00194": 396, "PF00199": 906, "PF00200": 67, "PF00203": 714, "PF00221": 669, "PF00231": 425, "PF00233": 27, "PF00237": 13, "PF00238": 225, "PF00241": 193, "PF00250": 772, "PF00252": 241, "PF00253": 433, "PF00268": 808, "PF00271": 599, "PF00276": 653, "PF00278": 80, "PF00281": 136, "PF00285": 844, "PF00288": 158, "PF00298": 615, "PF00300": 237, "PF00303": 877, "PF00309": 775, "PF00312": 44, "PF00314": 553, "PF00318": 986, "PF00327": 152, "PF00329": 637, "PF00330": 360, "PF00333": 279, "PF00338": 258, "PF00343": 364, "PF00344": 448, "PF00346": 878, "PF00348": 471, "PF00349": 831, "PF00352": 889, "PF00353": 110, "PF00358": 736, "PF00365": 514, "PF00366": 997, "PF00367": 983, "PF00376": 11, "PF00380": 447, "PF00381": 113, "PF00387": 993, "PF00391": 472, "PF00393": 393, "PF00396": 41, "PF00397": 481, "PF00400": 474, "PF00401": 842, "PF00410": 907, "PF00433": 602, "PF00438": 644, "PF00447": 689, "PF00453": 879, "PF00468": 205, 
"PF00471": 837, "PF00472": 522, "PF00475": 656, "PF00476": 458, "PF00478": 735, "PF00479": 767, "PF00484": 72, "PF00486": 888, "PF00490": 91, "PF00499": 540, "PF00507": 851, "PF00515": 79, "PF00542": 693, "PF00557": 42, "PF00560": 534, "PF00571": 387, "PF00572": 419, "PF00576": 590, "PF00584": 806, "PF00586": 154, "PF00593": 236, "PF00596": 902, "PF00611": 467, "PF00617": 403, "PF00624": 327, "PF00646": 582, "PF00662": 361, "PF00673": 259, "PF00677": 982, "PF00681": 301, "PF00684": 678, "PF00687": 140, "PF00699": 261, "PF00707": 849, "PF00709": 132, "PF00719": 8, "PF00731": 20, "PF00734": 182, "PF00742": 951, "PF00745": 297, "PF00755": 499, "PF00759": 909, "PF00763": 929, "PF00766": 118, "PF00771": 217, "PF00813": 235, "PF00815": 480, "PF00828": 667, "PF00829": 768, "PF00830": 355, "PF00831": 713, "PF00842": 275, "PF00858": 120, "PF00880": 754, "PF00883": 436, "PF00885": 187, "PF00886": 746, "PF00887": 288, "PF00888": 457, "PF00902": 220, "PF00908": 862, "PF00920": 416, "PF00925": 916, "PF00926": 697, "PF00936": 240, "PF00984": 964, "PF00986": 910, "PF00988": 106, "PF00995": 555, "PF01016": 751, "PF01018": 423, "PF01027": 431, "PF01035": 147, "PF01040": 348, "PF01043": 475, "PF01052": 496, "PF01055": 566, "PF01066": 413, "PF01084": 834, "PF01088": 726, "PF01106": 415, "PF01116": 126, "PF01132": 641, "PF01139": 196, "PF01149": 207, "PF01155": 594, "PF01156": 677, "PF01161": 312, "PF01165": 455, "PF01168": 338, "PF01169": 550, "PF01176": 484, "PF01179": 710, "PF01192": 881, "PF01193": 54, "PF01195": 368, "PF01196": 371, "PF01197": 114, "PF01205": 936, "PF01219": 323, "PF01220": 278, "PF01221": 230, "PF01226": 101, "PF01237": 102, "PF01239": 289, "PF01245": 574, "PF01250": 95, "PF01252": 864, "PF01253": 88, "PF01255": 191, "PF01259": 308, "PF01264": 190, "PF01266": 833, "PF01268": 24, "PF01281": 130, "PF01288": 717, "PF01297": 973, "PF01302": 210, "PF01311": 513, "PF01312": 755, "PF01313": 607, "PF01329": 435, "PF01330": 611, "PF01336": 302, "PF01339": 46, "PF01346": 
763, "PF01352": 818, "PF01357": 470, "PF01367": 290, "PF01368": 104, "PF01369": 16, "PF01379": 630, "PF01384": 562, "PF01386": 112, "PF01388": 859, "PF01392": 429, "PF01416": 782, "PF01424": 640, "PF01428": 903, "PF01430": 398, "PF01450": 530, "PF01455": 589, "PF01458": 675, "PF01484": 97, "PF01502": 588, "PF01504": 759, "PF01510": 246, "PF01514": 742, "PF01523": 691, "PF01535": 382, "PF01546": 633, "PF01556": 145, "PF01590": 625, "PF01618": 465, "PF01625": 787, "PF01628": 744, "PF01632": 738, "PF01634": 495, "PF01641": 802, "PF01642": 167, "PF01649": 17, "PF01652": 509, "PF01654": 971, "PF01657": 38, "PF01668": 939, "PF01679": 379, "PF01687": 334, "PF01693": 924, "PF01702": 224, "PF01706": 547, "PF01709": 976, "PF01713": 934, "PF01715": 908, "PF01722": 253, "PF01724": 935, "PF01725": 587, "PF01730": 440, "PF01741": 365, "PF01765": 488, "PF01769": 901, "PF01773": 476, "PF01774": 822, "PF01790": 269, "PF01798": 239, "PF01799": 123, "PF01804": 912, "PF01805": 863, "PF01808": 839, "PF01809": 905, "PF01813": 178, "PF01817": 502, "PF01820": 533, "PF01825": 762, "PF01832": 337, "PF01867": 52, "PF01871": 715, "PF01887": 688, "PF01888": 565, "PF01890": 992, "PF01894": 857, "PF01899": 786, "PF01904": 55, "PF01923": 756, "PF01924": 617, "PF01925": 855, "PF01926": 445, "PF01940": 385, "PF01944": 295, "PF01957": 157, "PF01960": 704, "PF01967": 687, "PF01969": 979, "PF01970": 163, "PF01975": 18, "PF01977": 643, "PF01980": 984, "PF01981": 695, "PF01985": 22, "PF01987": 926, "PF01988": 886, "PF01990": 940, "PF01992": 506, "PF02016": 668, "PF02020": 493, "PF02021": 579, "PF02033": 612, "PF02092": 825, "PF02096": 760, "PF02104": 729, "PF02132": 411, "PF02138": 389, "PF02152": 73, "PF02194": 294, "PF02201": 700, "PF02207": 381, "PF02225": 388, "PF02233": 870, "PF02244": 339, "PF02245": 684, "PF02256": 469, "PF02260": 156, "PF02261": 810, "PF02277": 264, "PF02308": 500, "PF02322": 226, "PF02325": 941, "PF02365": 821, "PF02367": 268, "PF02375": 468, "PF02383": 69, "PF02391": 195, 
"PF02397": 559, "PF02401": 866, "PF02405": 432, "PF02410": 473, "PF02415": 544, "PF02417": 788, "PF02424": 399, "PF02436": 64, "PF02445": 198, "PF02446": 606, "PF02457": 160, "PF02465": 990, "PF02466": 12, "PF02482": 285, "PF02491": 179, "PF02502": 356, "PF02503": 343, "PF02508": 664, "PF02511": 287, "PF02514": 719, "PF02517": 519, "PF02518": 233, "PF02537": 28, "PF02538": 424, "PF02542": 645, "PF02545": 39, "PF02547": 98, "PF02548": 370, "PF02559": 369, "PF02561": 750, "PF02563": 85, "PF02569": 296, "PF02570": 276, "PF02571": 273, "PF02572": 405, "PF02574": 874, "PF02575": 581, "PF02578": 666, "PF02580": 938, "PF02582": 249, "PF02583": 144, "PF02586": 922, "PF02590": 422, "PF02592": 344, "PF02594": 546, "PF02595": 622, "PF02601": 176, "PF02603": 638, "PF02607": 341, "PF02609": 238, "PF02615": 937, "PF02618": 148, "PF02622": 671, "PF02625": 63, "PF02626": 650, "PF02632": 100, "PF02633": 769, "PF02643": 486, "PF02646": 142, "PF02650": 686, "PF02654": 6, "PF02657": 430, "PF02660": 359, "PF02669": 616, "PF02671": 560, "PF02673": 197, "PF02677": 627, "PF02686": 50, "PF02690": 311, "PF02698": 442, "PF02699": 777, "PF02700": 846, "PF02729": 428, "PF02730": 977, "PF02733": 406, "PF02738": 838, "PF02739": 32, "PF02742": 748, "PF02771": 310, "PF02772": 875, "PF02773": 549, "PF02775": 604, "PF02777": 51, "PF02781": 948, "PF02787": 245, "PF02805": 891, "PF02811": 40, "PF02812": 813, "PF02817": 82, "PF02820": 655, "PF02833": 727, "PF02843": 407, "PF02844": 232, "PF02863": 331, "PF02873": 690, "PF02880": 397, "PF02881": 61, "PF02885": 896, "PF02887": 115, "PF02910": 920, "PF02912": 797, "PF02929": 814, "PF02934": 609, "PF02953": 109, "PF02965": 353, "PF02978": 317, "PF02985": 953, "PF02990": 723, "PF03015": 128, "PF03028": 299, "PF03030": 508, "PF03055": 975, "PF03073": 384, "PF03091": 966, "PF03098": 84, "PF03116": 134, "PF03119": 918, "PF03120": 635, "PF03124": 321, "PF03134": 329, "PF03140": 529, "PF03150": 567, "PF03186": 747, "PF03255": 871, "PF03313": 980, "PF03315": 189, 
"PF03323": 33, "PF03331": 543, "PF03334": 21, "PF03352": 505, "PF03367": 646, "PF03372": 322, "PF03381": 647, "PF03441": 823, "PF03443": 175, "PF03449": 49, "PF03453": 827, "PF03457": 124, "PF03458": 483, "PF03461": 610, "PF03462": 623, "PF03466": 262, "PF03473": 603, "PF03478": 856, "PF03479": 221, "PF03481": 507, "PF03483": 708, "PF03484": 234, "PF03489": 482, "PF03561": 583, "PF03588": 478, "PF03591": 947, "PF03595": 652, "PF03600": 796, "PF03609": 332, "PF03613": 93, "PF03618": 490, "PF03619": 30, "PF03625": 799, "PF03626": 510, "PF03636": 676, "PF03645": 489, "PF03705": 313, "PF03707": 792, "PF03719": 186, "PF03727": 373, "PF03729": 978, "PF03733": 117, "PF03737": 996, "PF03740": 227, "PF03746": 605, "PF03747": 463, "PF03748": 342, "PF03755": 933, "PF03775": 835, "PF03776": 137, "PF03788": 887, "PF03796": 256, "PF03797": 251, "PF03799": 921, "PF03808": 366, "PF03830": 215, "PF03840": 774, "PF03861": 608, "PF03862": 45, "PF03880": 868, "PF03883": 994, "PF03914": 357, "PF03928": 701, "PF03929": 962, "PF03937": 761, "PF03938": 453, "PF03946": 974, "PF03947": 172, "PF03948": 335, "PF03952": 116, "PF03975": 781, "PF03988": 899, "PF03990": 815, "PF03993": 819, "PF04000": 930, "PF04002": 561, "PF04003": 166, "PF04015": 376, "PF04020": 753, "PF04024": 202, "PF04032": 803, "PF04039": 4, "PF04043": 155, "PF04051": 619, "PF04055": 25, "PF04060": 894, "PF04066": 479, "PF04085": 707, "PF04087": 516, "PF04092": 536, "PF04117": 194, "PF04122": 29, "PF04127": 568, "PF04145": 737, "PF04149": 957, "PF04166": 804, "PF04168": 662, "PF04172": 680, "PF04199": 779, "PF04205": 129, "PF04229": 243, "PF04234": 955, "PF04237": 34, "PF04241": 552, "PF04248": 444, "PF04253": 37, "PF04255": 375, "PF04264": 595, "PF04294": 554, "PF04295": 932, "PF04296": 119, "PF04298": 324, "PF04299": 614, "PF04304": 749, "PF04314": 545, "PF04316": 92, "PF04325": 722, "PF04327": 349, "PF04333": 873, "PF04366": 532, "PF04367": 127, "PF04371": 121, "PF04376": 733, "PF04377": 180, "PF04390": 203, "PF04402": 
107, "PF04403": 706, "PF04408": 374, "PF04413": 266, "PF04427": 840, "PF04430": 320, "PF04461": 378, "PF04463": 626, "PF04468": 853, "PF04472": 672, "PF04515": 270, "PF04519": 86, "PF04536": 501, "PF04547": 942, "PF04551": 725, "PF04613": 817, "PF04675": 90, "PF04749": 848, "PF04777": 965, "PF04810": 211, "PF04815": 330, "PF04860": 452, "PF04892": 75, "PF04898": 969, "PF04960": 450, "PF04961": 306, "PF04963": 820, "PF04973": 165, "PF04982": 412, "PF05013": 1, "PF05025": 391, "PF05033": 790, "PF05161": 206, "PF05164": 981, "PF05173": 826, "PF05184": 882, "PF05191": 168, "PF05192": 987, "PF05195": 734, "PF05198": 315, "PF05201": 99, "PF05226": 138, "PF05235": 511, "PF05257": 188, "PF05336": 649, "PF05383": 87, "PF05402": 563, "PF05485": 282, "PF05494": 347, "PF05504": 585, "PF05524": 658, "PF05552": 174, "PF05598": 498, "PF05635": 621, "PF05649": 14, "PF05658": 824, "PF05681": 867, "PF05697": 274, "PF05792": 783, "PF05949": 441, "PF05960": 135, "PF06022": 776, "PF06026": 173, "PF06050": 517, "PF06071": 250, "PF06094": 15, "PF06130": 5, "PF06133": 702, "PF06165": 9, "PF06172": 248, "PF06201": 847, "PF06245": 141, "PF06415": 865, "PF06418": 830, "PF06421": 36, "PF06422": 345, "PF06429": 850, "PF06441": 410, "PF06477": 535, "PF06541": 504, "PF06580": 71, "PF06628": 758, "PF06686": 146, "PF06689": 911, "PF06719": 83, "PF06724": 77, "PF06750": 718, "PF06803": 919, "PF06912": 300, "PF06949": 728, "PF06961": 283, "PF06965": 57, "PF06968": 538, "PF07004": 340, "PF07075": 860, "PF07261": 487, "PF07264": 569, "PF07331": 876, "PF07332": 161, "PF07479": 503, "PF07486": 177, "PF07494": 778, "PF07497": 319, "PF07500": 895, "PF07501": 931, "PF07503": 19, "PF07516": 740, "PF07525": 789, "PF07538": 164, "PF07549": 618, "PF07556": 731, "PF07562": 959, "PF07593": 577, "PF07662": 963, "PF07664": 418, "PF07703": 169, "PF07715": 928, "PF07719": 629, "PF07735": 417, "PF07743": 811, "PF07804": 68, "PF07811": 277, "PF07873": 681, "PF07884": 43, "PF07907": 542, "PF07927": 284, "PF07963": 598, 
"PF07971": 247, "PF08029": 558, "PF08044": 427, "PF08148": 31, "PF08299": 70, "PF08310": 231, "PF08331": 757, "PF08338": 103, "PF08340": 972, "PF08345": 434, "PF08376": 812, "PF08378": 883, "PF08379": 970, "PF08393": 66, "PF08436": 454, "PF08439": 683, "PF08459": 794, "PF08486": 56, "PF08487": 362, "PF08494": 394, "PF08495": 659, "PF08497": 954, "PF08516": 531, "PF08522": 400, "PF08529": 785, "PF08542": 159, "PF08592": 795, "PF08669": 336, "PF08711": 551, "PF08712": 26, "PF08742": 925, "PF08766": 218, "PF08818": 949, "PF09118": 409, "PF09180": 998, "PF09269": 48, "PF09285": 518, "PF09298": 913, "PF09347": 661, "PF09362": 181, "PF09363": 459, "PF09365": 477, "PF09369": 967, "PF09371": 512, "PF09383": 408, "PF09397": 229, "PF09479": 705, "PF09527": 350, "PF09719": 352, "PF09723": 212, "PF09754": 634, "PF09827": 732, "PF09851": 991, "PF09989": 451, "PF09994": 960, "PF09995": 904, "PF10017": 303, "PF10035": 462, "PF10135": 252, "PF10143": 885, "PF10150": 223, "PF10369": 438, "PF10385": 421, "PF10396": 897, "PF10397": 791, "PF10410": 682, "PF10415": 628, "PF10431": 108, "PF10436": 35, "PF10442": 520, "PF10509": 537, "PF10518": 793, "PF10557": 525, "PF10576": 854, "PF10588": 869, "PF10589": 386, "PF10590": 584, "PF10601": 801, "PF10646": 494, "PF10728": 314, "PF10825": 600, "PF10996": 914, "PF11760": 23, "PF11799": 208, "PF11838": 765, "PF11915": 805, "PF11929": 404, "PF11975": 485, "PF11987": 660, "PF12002": 648, "PF12019": 884, "PF12344": 209, "PF12392": 784, "PF12396": 586, "PF12399": 94, "PF12464": 316, "PF12631": 800, "PF12661": 624, "PF12662": 958, "PF12673": 923, "PF12680": 597, "PF12686": 528, "PF12697": 575, "PF12704": 523, "PF12775": 219, "PF12781": 592, "PF12804": 254, "PF12806": 464, "PF12833": 703, "PF12874": 184, "PF12878": 521, "PF12911": 291, "PF12937": 420, "PF12951": 764, "PF13004": 292, "PF13012": 564, "PF13089": 593, "PF13167": 712, "PF13207": 133, "PF13234": 402, "PF13244": 988, "PF13277": 449, "PF13280": 351, "PF13288": 170, "PF13305": 265, 
"PF13335": 730, "PF13340": 272, "PF13356": 395, "PF13368": 632, "PF13376": 414, "PF13378": 739, "PF13382": 892, "PF13396": 377, "PF13399": 642, "PF13405": 636, "PF13408": 162, "PF13428": 679, "PF13439": 809, "PF13445": 199, "PF13449": 692, "PF13460": 829, "PF13472": 200, "PF13475": 183, "PF13478": 898, "PF13490": 573, "PF13493": 591, "PF13508": 443, "PF13517": 149, "PF13537": 150, "PF13545": 383, "PF13556": 541, "PF13559": 293, "PF13561": 171, "PF13565": 62, "PF13570": 244, "PF13597": 358, "PF13620": 952, "PF13634": 271, "PF13640": 771, "PF13649": 696, "PF13660": 741, "PF13676": 766, "PF13677": 828, "PF13682": 491, "PF13692": 915, "PF13720": 836, "PF13732": 111, "PF13742": 309, "PF13774": 390, "PF13793": 65, "PF13802": 307, "PF13850": 201, "PF13894": 832, "PF13920": 439, "PF13927": 460, "PF13932": 989, "PF13959": 363, "PF13975": 770, "PF13976": 872, "PF14008": 745, "PF14031": 346, "PF14226": 548, "PF14237": 631, "PF14278": 222, "PF14310": 673, "PF14319": 255, "PF14403": 60, "PF14416": 492, "PF14420": 968, "PF14450": 131, "PF14490": 372, "PF14497": 580, "PF14508": 81, "PF14510": 286, "PF14667": 945, "PF14693": 841, "PF14698": 709, "PF14714": 890, "PF14748": 670, "PF14791": 328, "PF14804": 900, "PF14805": 58, "PF14815": 380, "PF14841": 699, "PF14842": 639, "PF14905": 122, "PF15901": 89, "PF16016": 213, "PF16113": 943, "PF16123": 96, "PF16124": 59, "PF16177": 367, "PF16188": 526, "PF16189": 720, "PF16192": 694, "PF16193": 515, "PF16199": 153, "PF16212": 401, "PF16220": 326, "PF16320": 596, "PF16321": 858, "PF16325": 571, "PF16326": 281, "PF16344": 927, "PF16353": 74, "PF16355": 651, "PF16360": 995, "PF16486": 893, "PF16488": 228, "PF16491": 961, "PF16640": 999, "PF16653": 204, "PF16658": 654, "PF16859": 151, "PF16863": 143, "PF16870": 685, "PF16874": 946, "PF16875": 752, "PF16886": 539, "PF16901": 214, "PF16916": 880, "PF17042": 446, "PF17136": 845, "PF17137": 557, "PF17146": 665, "PF17202": 53, "PF17517": 956, "PF17676": 242, "PF17678": 298, "PF17755": 724, 
"PF17757": 47, "PF17759": 354, "PF17762": 76, "PF17763": 711, "PF17764": 185, "PF17768": 578, "PF17801": 716, "PF17802": 0, "PF17803": 263, "PF17827": 572, "PF17852": 333, "PF17853": 78, "PF17862": 944, "PF17871": 139, "PF17876": 426, "PF17910": 305, "PF17912": 105, "PF17919": 917, "PF17921": 10, "PF17940": 257, "PF17941": 620, "PF17957": 743, "PF17963": 852, "PF18052": 807, "PF18072": 674, "PF18074": 318, "PF18075": 3, "PF18076": 773, "PF18198": 466, "PF18199": 576, "PF18345": 843, "PF18741": 698, "PF18758": 780, "PF18759": 437, "PF18766": 456, "PF18803": 497, "PF18821": 2 }, "label2idx": { "PF00003": 657, "PF00006": 461, "PF00013": 192, "PF00019": 267, "PF00020": 663, "PF00023": 721, "PF00036": 950, "PF00037": 216, "PF00040": 613, "PF00080": 985, "PF00095": 816, "PF00100": 125, "PF00104": 524, "PF00115": 325, "PF00119": 601, "PF00126": 861, "PF00137": 556, "PF00140": 527, "PF00146": 304, "PF00149": 260, "PF00162": 392, "PF00163": 7, "PF00174": 798, "PF00177": 570, "PF00189": 280, "PF00194": 396, "PF00199": 906, "PF00200": 67, "PF00203": 714, "PF00221": 669, "PF00231": 425, "PF00233": 27, "PF00237": 13, "PF00238": 225, "PF00241": 193, "PF00250": 772, "PF00252": 241, "PF00253": 433, "PF00268": 808, "PF00271": 599, "PF00276": 653, "PF00278": 80, "PF00281": 136, "PF00285": 844, "PF00288": 158, "PF00298": 615, "PF00300": 237, "PF00303": 877, "PF00309": 775, "PF00312": 44, "PF00314": 553, "PF00318": 986, "PF00327": 152, "PF00329": 637, "PF00330": 360, "PF00333": 279, "PF00338": 258, "PF00343": 364, "PF00344": 448, "PF00346": 878, "PF00348": 471, "PF00349": 831, "PF00352": 889, "PF00353": 110, "PF00358": 736, "PF00365": 514, "PF00366": 997, "PF00367": 983, "PF00376": 11, "PF00380": 447, "PF00381": 113, "PF00387": 993, "PF00391": 472, "PF00393": 393, "PF00396": 41, "PF00397": 481, "PF00400": 474, "PF00401": 842, "PF00410": 907, "PF00433": 602, "PF00438": 644, "PF00447": 689, "PF00453": 879, "PF00468": 205, "PF00471": 837, "PF00472": 522, "PF00475": 656, "PF00476": 458, 
"PF00478": 735, "PF00479": 767, "PF00484": 72, "PF00486": 888, "PF00490": 91, "PF00499": 540, "PF00507": 851, "PF00515": 79, "PF00542": 693, "PF00557": 42, "PF00560": 534, "PF00571": 387, "PF00572": 419, "PF00576": 590, "PF00584": 806, "PF00586": 154, "PF00593": 236, "PF00596": 902, "PF00611": 467, "PF00617": 403, "PF00624": 327, "PF00646": 582, "PF00662": 361, "PF00673": 259, "PF00677": 982, "PF00681": 301, "PF00684": 678, "PF00687": 140, "PF00699": 261, "PF00707": 849, "PF00709": 132, "PF00719": 8, "PF00731": 20, "PF00734": 182, "PF00742": 951, "PF00745": 297, "PF00755": 499, "PF00759": 909, "PF00763": 929, "PF00766": 118, "PF00771": 217, "PF00813": 235, "PF00815": 480, "PF00828": 667, "PF00829": 768, "PF00830": 355, "PF00831": 713, "PF00842": 275, "PF00858": 120, "PF00880": 754, "PF00883": 436, "PF00885": 187, "PF00886": 746, "PF00887": 288, "PF00888": 457, "PF00902": 220, "PF00908": 862, "PF00920": 416, "PF00925": 916, "PF00926": 697, "PF00936": 240, "PF00984": 964, "PF00986": 910, "PF00988": 106, "PF00995": 555, "PF01016": 751, "PF01018": 423, "PF01027": 431, "PF01035": 147, "PF01040": 348, "PF01043": 475, "PF01052": 496, "PF01055": 566, "PF01066": 413, "PF01084": 834, "PF01088": 726, "PF01106": 415, "PF01116": 126, "PF01132": 641, "PF01139": 196, "PF01149": 207, "PF01155": 594, "PF01156": 677, "PF01161": 312, "PF01165": 455, "PF01168": 338, "PF01169": 550, "PF01176": 484, "PF01179": 710, "PF01192": 881, "PF01193": 54, "PF01195": 368, "PF01196": 371, "PF01197": 114, "PF01205": 936, "PF01219": 323, "PF01220": 278, "PF01221": 230, "PF01226": 101, "PF01237": 102, "PF01239": 289, "PF01245": 574, "PF01250": 95, "PF01252": 864, "PF01253": 88, "PF01255": 191, "PF01259": 308, "PF01264": 190, "PF01266": 833, "PF01268": 24, "PF01281": 130, "PF01288": 717, "PF01297": 973, "PF01302": 210, "PF01311": 513, "PF01312": 755, "PF01313": 607, "PF01329": 435, "PF01330": 611, "PF01336": 302, "PF01339": 46, "PF01346": 763, "PF01352": 818, "PF01357": 470, "PF01367": 290, "PF01368": 
104, "PF01369": 16, "PF01379": 630, "PF01384": 562, "PF01386": 112, "PF01388": 859, "PF01392": 429, "PF01416": 782, "PF01424": 640, "PF01428": 903, "PF01430": 398, "PF01450": 530, "PF01455": 589, "PF01458": 675, "PF01484": 97, "PF01502": 588, "PF01504": 759, "PF01510": 246, "PF01514": 742, "PF01523": 691, "PF01535": 382, "PF01546": 633, "PF01556": 145, "PF01590": 625, "PF01618": 465, "PF01625": 787, "PF01628": 744, "PF01632": 738, "PF01634": 495, "PF01641": 802, "PF01642": 167, "PF01649": 17, "PF01652": 509, "PF01654": 971, "PF01657": 38, "PF01668": 939, "PF01679": 379, "PF01687": 334, "PF01693": 924, "PF01702": 224, "PF01706": 547, "PF01709": 976, "PF01713": 934, "PF01715": 908, "PF01722": 253, "PF01724": 935, "PF01725": 587, "PF01730": 440, "PF01741": 365, "PF01765": 488, "PF01769": 901, "PF01773": 476, "PF01774": 822, "PF01790": 269, "PF01798": 239, "PF01799": 123, "PF01804": 912, "PF01805": 863, "PF01808": 839, "PF01809": 905, "PF01813": 178, "PF01817": 502, "PF01820": 533, "PF01825": 762, "PF01832": 337, "PF01867": 52, "PF01871": 715, "PF01887": 688, "PF01888": 565, "PF01890": 992, "PF01894": 857, "PF01899": 786, "PF01904": 55, "PF01923": 756, "PF01924": 617, "PF01925": 855, "PF01926": 445, "PF01940": 385, "PF01944": 295, "PF01957": 157, "PF01960": 704, "PF01967": 687, "PF01969": 979, "PF01970": 163, "PF01975": 18, "PF01977": 643, "PF01980": 984, "PF01981": 695, "PF01985": 22, "PF01987": 926, "PF01988": 886, "PF01990": 940, "PF01992": 506, "PF02016": 668, "PF02020": 493, "PF02021": 579, "PF02033": 612, "PF02092": 825, "PF02096": 760, "PF02104": 729, "PF02132": 411, "PF02138": 389, "PF02152": 73, "PF02194": 294, "PF02201": 700, "PF02207": 381, "PF02225": 388, "PF02233": 870, "PF02244": 339, "PF02245": 684, "PF02256": 469, "PF02260": 156, "PF02261": 810, "PF02277": 264, "PF02308": 500, "PF02322": 226, "PF02325": 941, "PF02365": 821, "PF02367": 268, "PF02375": 468, "PF02383": 69, "PF02391": 195, "PF02397": 559, "PF02401": 866, "PF02405": 432, "PF02410": 473, 
"PF02415": 544, "PF02417": 788, "PF02424": 399, "PF02436": 64, "PF02445": 198, "PF02446": 606, "PF02457": 160, "PF02465": 990, "PF02466": 12, "PF02482": 285, "PF02491": 179, "PF02502": 356, "PF02503": 343, "PF02508": 664, "PF02511": 287, "PF02514": 719, "PF02517": 519, "PF02518": 233, "PF02537": 28, "PF02538": 424, "PF02542": 645, "PF02545": 39, "PF02547": 98, "PF02548": 370, "PF02559": 369, "PF02561": 750, "PF02563": 85, "PF02569": 296, "PF02570": 276, "PF02571": 273, "PF02572": 405, "PF02574": 874, "PF02575": 581, "PF02578": 666, "PF02580": 938, "PF02582": 249, "PF02583": 144, "PF02586": 922, "PF02590": 422, "PF02592": 344, "PF02594": 546, "PF02595": 622, "PF02601": 176, "PF02603": 638, "PF02607": 341, "PF02609": 238, "PF02615": 937, "PF02618": 148, "PF02622": 671, "PF02625": 63, "PF02626": 650, "PF02632": 100, "PF02633": 769, "PF02643": 486, "PF02646": 142, "PF02650": 686, "PF02654": 6, "PF02657": 430, "PF02660": 359, "PF02669": 616, "PF02671": 560, "PF02673": 197, "PF02677": 627, "PF02686": 50, "PF02690": 311, "PF02698": 442, "PF02699": 777, "PF02700": 846, "PF02729": 428, "PF02730": 977, "PF02733": 406, "PF02738": 838, "PF02739": 32, "PF02742": 748, "PF02771": 310, "PF02772": 875, "PF02773": 549, "PF02775": 604, "PF02777": 51, "PF02781": 948, "PF02787": 245, "PF02805": 891, "PF02811": 40, "PF02812": 813, "PF02817": 82, "PF02820": 655, "PF02833": 727, "PF02843": 407, "PF02844": 232, "PF02863": 331, "PF02873": 690, "PF02880": 397, "PF02881": 61, "PF02885": 896, "PF02887": 115, "PF02910": 920, "PF02912": 797, "PF02929": 814, "PF02934": 609, "PF02953": 109, "PF02965": 353, "PF02978": 317, "PF02985": 953, "PF02990": 723, "PF03015": 128, "PF03028": 299, "PF03030": 508, "PF03055": 975, "PF03073": 384, "PF03091": 966, "PF03098": 84, "PF03116": 134, "PF03119": 918, "PF03120": 635, "PF03124": 321, "PF03134": 329, "PF03140": 529, "PF03150": 567, "PF03186": 747, "PF03255": 871, "PF03313": 980, "PF03315": 189, "PF03323": 33, "PF03331": 543, "PF03334": 21, "PF03352": 505, 
"PF03367": 646, "PF03372": 322, "PF03381": 647, "PF03441": 823, "PF03443": 175, "PF03449": 49, "PF03453": 827, "PF03457": 124, "PF03458": 483, "PF03461": 610, "PF03462": 623, "PF03466": 262, "PF03473": 603, "PF03478": 856, "PF03479": 221, "PF03481": 507, "PF03483": 708, "PF03484": 234, "PF03489": 482, "PF03561": 583, "PF03588": 478, "PF03591": 947, "PF03595": 652, "PF03600": 796, "PF03609": 332, "PF03613": 93, "PF03618": 490, "PF03619": 30, "PF03625": 799, "PF03626": 510, "PF03636": 676, "PF03645": 489, "PF03705": 313, "PF03707": 792, "PF03719": 186, "PF03727": 373, "PF03729": 978, "PF03733": 117, "PF03737": 996, "PF03740": 227, "PF03746": 605, "PF03747": 463, "PF03748": 342, "PF03755": 933, "PF03775": 835, "PF03776": 137, "PF03788": 887, "PF03796": 256, "PF03797": 251, "PF03799": 921, "PF03808": 366, "PF03830": 215, "PF03840": 774, "PF03861": 608, "PF03862": 45, "PF03880": 868, "PF03883": 994, "PF03914": 357, "PF03928": 701, "PF03929": 962, "PF03937": 761, "PF03938": 453, "PF03946": 974, "PF03947": 172, "PF03948": 335, "PF03952": 116, "PF03975": 781, "PF03988": 899, "PF03990": 815, "PF03993": 819, "PF04000": 930, "PF04002": 561, "PF04003": 166, "PF04015": 376, "PF04020": 753, "PF04024": 202, "PF04032": 803, "PF04039": 4, "PF04043": 155, "PF04051": 619, "PF04055": 25, "PF04060": 894, "PF04066": 479, "PF04085": 707, "PF04087": 516, "PF04092": 536, "PF04117": 194, "PF04122": 29, "PF04127": 568, "PF04145": 737, "PF04149": 957, "PF04166": 804, "PF04168": 662, "PF04172": 680, "PF04199": 779, "PF04205": 129, "PF04229": 243, "PF04234": 955, "PF04237": 34, "PF04241": 552, "PF04248": 444, "PF04253": 37, "PF04255": 375, "PF04264": 595, "PF04294": 554, "PF04295": 932, "PF04296": 119, "PF04298": 324, "PF04299": 614, "PF04304": 749, "PF04314": 545, "PF04316": 92, "PF04325": 722, "PF04327": 349, "PF04333": 873, "PF04366": 532, "PF04367": 127, "PF04371": 121, "PF04376": 733, "PF04377": 180, "PF04390": 203, "PF04402": 107, "PF04403": 706, "PF04408": 374, "PF04413": 266, "PF04427": 
840, "PF04430": 320, "PF04461": 378, "PF04463": 626, "PF04468": 853, "PF04472": 672, "PF04515": 270, "PF04519": 86, "PF04536": 501, "PF04547": 942, "PF04551": 725, "PF04613": 817, "PF04675": 90, "PF04749": 848, "PF04777": 965, "PF04810": 211, "PF04815": 330, "PF04860": 452, "PF04892": 75, "PF04898": 969, "PF04960": 450, "PF04961": 306, "PF04963": 820, "PF04973": 165, "PF04982": 412, "PF05013": 1, "PF05025": 391, "PF05033": 790, "PF05161": 206, "PF05164": 981, "PF05173": 826, "PF05184": 882, "PF05191": 168, "PF05192": 987, "PF05195": 734, "PF05198": 315, "PF05201": 99, "PF05226": 138, "PF05235": 511, "PF05257": 188, "PF05336": 649, "PF05383": 87, "PF05402": 563, "PF05485": 282, "PF05494": 347, "PF05504": 585, "PF05524": 658, "PF05552": 174, "PF05598": 498, "PF05635": 621, "PF05649": 14, "PF05658": 824, "PF05681": 867, "PF05697": 274, "PF05792": 783, "PF05949": 441, "PF05960": 135, "PF06022": 776, "PF06026": 173, "PF06050": 517, "PF06071": 250, "PF06094": 15, "PF06130": 5, "PF06133": 702, "PF06165": 9, "PF06172": 248, "PF06201": 847, "PF06245": 141, "PF06415": 865, "PF06418": 830, "PF06421": 36, "PF06422": 345, "PF06429": 850, "PF06441": 410, "PF06477": 535, "PF06541": 504, "PF06580": 71, "PF06628": 758, "PF06686": 146, "PF06689": 911, "PF06719": 83, "PF06724": 77, "PF06750": 718, "PF06803": 919, "PF06912": 300, "PF06949": 728, "PF06961": 283, "PF06965": 57, "PF06968": 538, "PF07004": 340, "PF07075": 860, "PF07261": 487, "PF07264": 569, "PF07331": 876, "PF07332": 161, "PF07479": 503, "PF07486": 177, "PF07494": 778, "PF07497": 319, "PF07500": 895, "PF07501": 931, "PF07503": 19, "PF07516": 740, "PF07525": 789, "PF07538": 164, "PF07549": 618, "PF07556": 731, "PF07562": 959, "PF07593": 577, "PF07662": 963, "PF07664": 418, "PF07703": 169, "PF07715": 928, "PF07719": 629, "PF07735": 417, "PF07743": 811, "PF07804": 68, "PF07811": 277, "PF07873": 681, "PF07884": 43, "PF07907": 542, "PF07927": 284, "PF07963": 598, "PF07971": 247, "PF08029": 558, "PF08044": 427, "PF08148": 31, 
"PF08299": 70, "PF08310": 231, "PF08331": 757, "PF08338": 103, "PF08340": 972, "PF08345": 434, "PF08376": 812, "PF08378": 883, "PF08379": 970, "PF08393": 66, "PF08436": 454, "PF08439": 683, "PF08459": 794, "PF08486": 56, "PF08487": 362, "PF08494": 394, "PF08495": 659, "PF08497": 954, "PF08516": 531, "PF08522": 400, "PF08529": 785, "PF08542": 159, "PF08592": 795, "PF08669": 336, "PF08711": 551, "PF08712": 26, "PF08742": 925, "PF08766": 218, "PF08818": 949, "PF09118": 409, "PF09180": 998, "PF09269": 48, "PF09285": 518, "PF09298": 913, "PF09347": 661, "PF09362": 181, "PF09363": 459, "PF09365": 477, "PF09369": 967, "PF09371": 512, "PF09383": 408, "PF09397": 229, "PF09479": 705, "PF09527": 350, "PF09719": 352, "PF09723": 212, "PF09754": 634, "PF09827": 732, "PF09851": 991, "PF09989": 451, "PF09994": 960, "PF09995": 904, "PF10017": 303, "PF10035": 462, "PF10135": 252, "PF10143": 885, "PF10150": 223, "PF10369": 438, "PF10385": 421, "PF10396": 897, "PF10397": 791, "PF10410": 682, "PF10415": 628, "PF10431": 108, "PF10436": 35, "PF10442": 520, "PF10509": 537, "PF10518": 793, "PF10557": 525, "PF10576": 854, "PF10588": 869, "PF10589": 386, "PF10590": 584, "PF10601": 801, "PF10646": 494, "PF10728": 314, "PF10825": 600, "PF10996": 914, "PF11760": 23, "PF11799": 208, "PF11838": 765, "PF11915": 805, "PF11929": 404, "PF11975": 485, "PF11987": 660, "PF12002": 648, "PF12019": 884, "PF12344": 209, "PF12392": 784, "PF12396": 586, "PF12399": 94, "PF12464": 316, "PF12631": 800, "PF12661": 624, "PF12662": 958, "PF12673": 923, "PF12680": 597, "PF12686": 528, "PF12697": 575, "PF12704": 523, "PF12775": 219, "PF12781": 592, "PF12804": 254, "PF12806": 464, "PF12833": 703, "PF12874": 184, "PF12878": 521, "PF12911": 291, "PF12937": 420, "PF12951": 764, "PF13004": 292, "PF13012": 564, "PF13089": 593, "PF13167": 712, "PF13207": 133, "PF13234": 402, "PF13244": 988, "PF13277": 449, "PF13280": 351, "PF13288": 170, "PF13305": 265, "PF13335": 730, "PF13340": 272, "PF13356": 395, "PF13368": 632, 
"PF13376": 414, "PF13378": 739, "PF13382": 892, "PF13396": 377, "PF13399": 642, "PF13405": 636, "PF13408": 162, "PF13428": 679, "PF13439": 809, "PF13445": 199, "PF13449": 692, "PF13460": 829, "PF13472": 200, "PF13475": 183, "PF13478": 898, "PF13490": 573, "PF13493": 591, "PF13508": 443, "PF13517": 149, "PF13537": 150, "PF13545": 383, "PF13556": 541, "PF13559": 293, "PF13561": 171, "PF13565": 62, "PF13570": 244, "PF13597": 358, "PF13620": 952, "PF13634": 271, "PF13640": 771, "PF13649": 696, "PF13660": 741, "PF13676": 766, "PF13677": 828, "PF13682": 491, "PF13692": 915, "PF13720": 836, "PF13732": 111, "PF13742": 309, "PF13774": 390, "PF13793": 65, "PF13802": 307, "PF13850": 201, "PF13894": 832, "PF13920": 439, "PF13927": 460, "PF13932": 989, "PF13959": 363, "PF13975": 770, "PF13976": 872, "PF14008": 745, "PF14031": 346, "PF14226": 548, "PF14237": 631, "PF14278": 222, "PF14310": 673, "PF14319": 255, "PF14403": 60, "PF14416": 492, "PF14420": 968, "PF14450": 131, "PF14490": 372, "PF14497": 580, "PF14508": 81, "PF14510": 286, "PF14667": 945, "PF14693": 841, "PF14698": 709, "PF14714": 890, "PF14748": 670, "PF14791": 328, "PF14804": 900, "PF14805": 58, "PF14815": 380, "PF14841": 699, "PF14842": 639, "PF14905": 122, "PF15901": 89, "PF16016": 213, "PF16113": 943, "PF16123": 96, "PF16124": 59, "PF16177": 367, "PF16188": 526, "PF16189": 720, "PF16192": 694, "PF16193": 515, "PF16199": 153, "PF16212": 401, "PF16220": 326, "PF16320": 596, "PF16321": 858, "PF16325": 571, "PF16326": 281, "PF16344": 927, "PF16353": 74, "PF16355": 651, "PF16360": 995, "PF16486": 893, "PF16488": 228, "PF16491": 961, "PF16640": 999, "PF16653": 204, "PF16658": 654, "PF16859": 151, "PF16863": 143, "PF16870": 685, "PF16874": 946, "PF16875": 752, "PF16886": 539, "PF16901": 214, "PF16916": 880, "PF17042": 446, "PF17136": 845, "PF17137": 557, "PF17146": 665, "PF17202": 53, "PF17517": 956, "PF17676": 242, "PF17678": 298, "PF17755": 724, "PF17757": 47, "PF17759": 354, "PF17762": 76, "PF17763": 711, "PF17764": 
185, "PF17768": 578, "PF17801": 716, "PF17802": 0, "PF17803": 263, "PF17827": 572, "PF17852": 333, "PF17853": 78, "PF17862": 944, "PF17871": 139, "PF17876": 426, "PF17910": 305, "PF17912": 105, "PF17919": 917, "PF17921": 10, "PF17940": 257, "PF17941": 620, "PF17957": 743, "PF17963": 852, "PF18052": 807, "PF18072": 674, "PF18074": 318, "PF18075": 3, "PF18076": 773, "PF18198": 466, "PF18199": 576, "PF18345": 843, "PF18741": 698, "PF18758": 780, "PF18759": 437, "PF18766": 456, "PF18803": 497, "PF18821": 2 }, "layer_norm_eps": 1e-12, "max_position_embeddings": 514, "model_type": "bert", "num_attention_heads": 8, "num_hidden_layers": 2, "pad_token_id": 0, "position_embedding_type": "absolute", "problem_type": "single_label_classification", "torch_dtype": "float32", "transformers_version": "4.28.1", "type_vocab_size": 1, "use_cache": true, "vocab_size": 261 }