Protein Info for mRNA_6209 in Rhodosporidium toruloides IFO0880

Name: 14577
Annotation: K03006 RPB1, POLR2A DNA-directed RNA polymerase II subunit RPB1

These analyses and tools can help you predict a protein's function, but be skeptical. For enzymes, over 10% of annotations from KEGG or SEED are probably incorrect. For other types of proteins, the error rates may be much higher. MetaCyc and Swiss-Prot have low error rates, but the best hits in these databases are often quite distant, so this protein's function may not be the same. TIGRFam has low error rates. Finally, many experimentally characterized proteins are not in any of these databases. To find relevant papers, use PaperBLAST.

Protein Families and Features

Domain map (amino acid positions 1 to 1756)

PF04997: RNA_pol_Rpb1_1 amino acids 17 to 353 (337 residues), 345.3 bits, see alignment E=1.5e-106
PF00623: RNA_pol_Rpb1_2 amino acids 355 to 518 (164 residues), 251 bits, see alignment E=2.8e-78
PF04983: RNA_pol_Rpb1_3 amino acids 522 to 681 (160 residues), 161.4 bits, see alignment E=7.1e-51
PF05000: RNA_pol_Rpb1_4 amino acids 707 to 812 (106 residues), 126.1 bits, see alignment E=2.2e-40
PF04998: RNA_pol_Rpb1_5 amino acids 819 to 1402 (584 residues), 316.6 bits, see alignment E=5.6e-98
PF04992: RNA_pol_Rpb1_6 amino acids 875 to 1058 (184 residues), 218.4 bits, see alignment E=3.2e-68
PF04990: RNA_pol_Rpb1_7 amino acids 1143 to 1279 (137 residues), 171.4 bits, see alignment E=4e-54
PF05001: RNA_pol_Rpb1_R
    amino acids 1560 to 1572 (13 residues), 6.4 bits, see alignment (E = 0.0056)
    amino acids 1590 to 1607 (18 residues), 9.7 bits, see alignment (E = 0.00051)
    amino acids 1612 to 1625 (14 residues), 12.8 bits, see alignment (E = 5.1e-05)
    amino acids 1619 to 1632 (14 residues), 15.3 bits, see alignment (E = 8.8e-06)
    amino acids 1626 to 1639 (14 residues), 14.2 bits, see alignment (E = 1.8e-05)
    amino acids 1633 to 1646 (14 residues), 12.1 bits, see alignment (E = 8.5e-05)
    amino acids 1640 to 1653 (14 residues), 13.4 bits, see alignment (E = 3.5e-05)
    amino acids 1647 to 1660 (14 residues), 12.3 bits, see alignment (E = 7.5e-05)
    amino acids 1654 to 1667 (14 residues), 14.9 bits, see alignment (E = 1.1e-05)
    amino acids 1661 to 1674 (14 residues), 18.8 bits, see alignment (E = 6.5e-07)
    amino acids 1675 to 1688 (14 residues), 18.3 bits, see alignment (E = 9.3e-07)
    amino acids 1689 to 1702 (14 residues), 18.3 bits, see alignment (E = 9.3e-07)
    amino acids 1703 to 1716 (14 residues), 16.2 bits, see alignment (E = 4.5e-06)
    amino acids 1717 to 1729 (13 residues), 9.5 bits, see alignment (E = 0.00059)
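
The domain hits above report 1-based, inclusive coordinates, so each residue count should equal end - start + 1. The following Python is a minimal sketch for pulling the coordinates out of that text and checking the arithmetic. The names `HIT_RE`, `parse_hits`, and `span_is_consistent` are hypothetical helpers introduced here for illustration and are not part of any tool referenced on this page; the sample text repeats two hits from the listing.

```python
import re

# Matches the "amino acids A to B (N residues), S bits" phrasing used in the listing.
HIT_RE = re.compile(
    r"amino acids (\d+) to (\d+) \((\d+) residues\), ([\d.]+) bits"
)

def parse_hits(text):
    """Return a list of (start, end, residues, bits) tuples found in text."""
    return [
        (int(a), int(b), int(n), float(s))
        for a, b, n, s in HIT_RE.findall(text)
    ]

def span_is_consistent(start, end, residues):
    """Coordinates are 1-based and inclusive, so length is end - start + 1."""
    return end - start + 1 == residues

# Two hits copied from the listing, used here only as sample input.
sample = (
    "PF04997: RNA_pol_Rpb1_1 amino acids 17 to 353 (337 residues), 345.3 bits\n"
    "PF00623: RNA_pol_Rpb1_2 amino acids 355 to 518 (164 residues), 251 bits\n"
)
hits = parse_hits(sample)
all_consistent = all(span_is_consistent(s, e, n) for s, e, n, _ in hits)
```

Running the same check over the full listing is one way to catch transcription errors in the coordinates.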

Best Hits

Predicted SEED Role

"DNA-directed RNA polymerase II largest subunit (EC 2.7.7.6)" in subsystem RNA polymerase II (EC 2.7.7.6)

KEGG Metabolic Maps

Isozymes

Compare fitness of predicted isozymes for: 2.7.7.6

Use Curated BLAST to search for 2.7.7.6

Sequence Analysis Tools

PaperBLAST (search for papers about homologs of this protein)

Search CDD (the Conserved Domains Database, which includes COG and superfam)

Search structures

Predict protein localization: PSORTb (Gram-negative bacteria)

Predict transmembrane helices and signal peptides: Phobius

Check the current SEED with FIGfam search

Find homologs in fast.genomics or the ENIGMA genome browser

Find the best match in UniProt

Protein Sequence (1756 amino acids)

>mRNA_6209 K03006 RPB1, POLR2A DNA-directed RNA polymerase II subunit RPB1 (Rhodosporidium toruloides IFO0880)
MLGHEFAHSTQPIKRPRTIQFGVLSPEEIKAISVCKVEYPETFEEGNAKPKTGGLSDPRL
GTIDRNFKCATCGEGMAECPGHFGHIELSRAVYHVGFINKVKKITECICVQCGKLKVSPA
EDPRLADAVRFVRDPKKRLAVVHALVKGKNSCDMTIMDEETQAKIAAGEPVPPGHGGCGH
EQPQIRKEGLKLFLQYGKGKDEDGNAQAPDRRPFTATQAHTLFRKISDEDLHILGLSATE
AHPAWMILTVLPVPPPPVRPSIAVDGGAMRGEDDLTYKLAEIIRTNSSLRKFEEEGAPAH
VINEFETLLQWHIATYMDNDLPGQPQALQKSGRPVKSIRARLKGKEGRLRGNLMGKRVDF
SARTVITGDPNLALDEVGVPYSIARTLTYPERVTPYNIQDLQTLVRNGPNEYPGARYVVR
DTGDRIDLRYNKRADTFLQYGWIVERHLKDGDIVLFNRQPSLHKMSMMAHRVKLMPYSTF
RLNLSVTSPYNADFDGDEMNLHVPQSEETRSELINIAAVARNIISPQANKPVMGIVQDTL
CGIRKFTLRDCFMDLEFIQNILLWVPEWDGVIPPPAILKPKPLWTGKQILSMCIPAGINL
YRASASGTTNPPEDDGMFIEDGQVMFGTVDKKTVGTSQGGLIHVIYREKGHEVCRDWFSG
VQKVVNFWLLHNGFSIGIGDTVPDKGTMEAITGFIDTAKREVSNIIQQAQNDQLEPEPGM
TVRESFESKVTRALNQARDSSGRSAERSLKADNNVKQMVVAGSKGSFINISQMSACVGQQ
IVEGKRCPFGFKYRTLPHFAKDDYSPEARGFVENSYLRGLTPQEFFFHAMAGREGLIDTA
VKTAETGYIQRRLVKALEDVMVAYDGTFCYGEDGMDGAFIEEQVVEPIRLSDARFEKRYR
VDITDPTWHFKPGTLHVDLAPPENTDLQTVLDYEYEQLKRDRATLREVFPKGDPTVSLPV
NLRRVIQNAQQIFHIDFRNASDIPPADIVETVRDLCDRLVVIRGKDDLSAEAQNNATLLF
RIFLRSVLAVRRVIEEYHLNREAFEWVVGEVESKFNASLVAPGEMCGTLAAQSIGEPATQ
MTLNTFHYAGVSSKNVTLGVPRLKEIINVAVNLKTPSTTVYLDPDYAKDIQLAKEVQTRL
SYTTLQTVTASTEIFYDPDPTATVIEDDREFVEAFFAIPDEDVEANLHRQSPWLLRFELD
RAKMLDKKLEMHYVAGKIAETFEQDLFVIWSEDNAEKLVIRCRALKSADADKEGGDEDED
DEDVFLKQLESQMLGSIALSGVEGIDRVFMMEQKRPQLNAAGEFHTPNEWVLETDGINLR
KVLCTEGVDARRTISNSCVEMLEVLGIEAARASVLRELRNVIEFDGSKVNYRHLSMLTDI
MTNRGSLMAITRHGINRADTGALMRCSFEETVEILMEAAAAGETDYCTGVAEAVLLGQQP
PMGTGAFELSLDLNKLQEVIPDHRLPPPPNMQIDAVMDGGRTPGGGATPYADLSEGKTPA
YYGQQEQLGMAMFSPIIQPGQDGMYDQGGFGGASPFGGASPFGGASPFAGGQSPFSGGQS
PFGAAGYSPTSPFGGASPYNPRSPGPSGGASPWVNRSGGGGYSPTSPLIGQSPASPRFSP
ASPNYSPTSPAFSPASPRFSPASPAFSPASPRFSPASPAFSPTSPAYSPTSPKFSPTSPA
YSPTSPKFSPTSPAYSPTSPKFSPTSPAYSPTSPAFSPASPSYSPRSPGSGGADGANGSQ
QQRQNAPYSSSASWKR
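
The record above is in FASTA format, and its heading states a length of 1756 amino acids. A minimal Python sketch for reading such a record and measuring the sequence length is shown below. The function name `read_fasta` is a hypothetical helper, and the `excerpt` string is a deliberately truncated illustration (only the first and last sequence lines of the record), not the full protein; to confirm the stated length, the complete record would need to be passed in instead.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping each header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines are wrapped; collect them and join at the end.
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}

# Truncated excerpt for illustration only: first and last sequence lines of the record.
excerpt = """>mRNA_6209 excerpt
MLGHEFAHSTQPIKRPRTIQFGVLSPEEIKAISVCKVEYPETFEEGNAKPKTGGLSDPRL
QQRQNAPYSSSASWKR
"""
seqs = read_fasta(excerpt)
length = len(seqs["mRNA_6209 excerpt"])
```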