MidgeBase gene description page [Pn.04271]

Outline

Gene ID Pn.04271
Type Protein coding gene
Scaffold PnScaf3491
Start 28434
End 34509
Direction +

Sequence

Transcript: 4230 (bp)

 ATGAAGTTGTGGAACTTTCTGATACTTTTGGTGTCTCTCAATCGGATTAAAAACGCATCGGCGGAGCTTTACGATGATCAACCCAAAATCCTCGTCAACGACGGCAACTTGGTGTTCGAGTCAGCCGACTCGAAGAACATAACGGTGAAGCTGAAGGGCCGCAGTCGCTTTTTAGTCAATGACCTCGACGTGCTCACGATGCTCTCAAATGTCTCTGGCACTGCTTCTAGTGGCAATCCAGGGCAACTCCGTCCCGGCCTTGTTGCAAGCATTCTTCAGCAGATCACAACGCTCCGCGACACCGTCCACCGCCAGAGCCAGCGAATCGACGAGCTCGAGCTCGGCGCAAGCAATGGCAACGCAAACAGCGGAAACAACGGCACAAACACGACGCTCAGCGATCGACAGCGAATCAACAGAATCGGCCGCCGGGTGACTGCTCTCGAGCAGCGCCTTGCCAATCTTACCGAACGACTCACGAAGGACCACTGCAAGAGCTATCCGTGCCAGAACGGCGGCACCTGCTTCAATCTCTTCGACACGTACCGCTGTGAGTGCCCCGAAAATTACGAGGGCCCTCAGTGCACGCTCGACGTCAACGAGTGCGCCCGCTATGTGGGCACCGACCTCGGATGCCAGAACGGAGCGACTTGTGTGAACACCTTCGGCGCTTACGAGTGCACATGTACCGAGGGATGGCACGGCAAGAACTGCAACCGAAGGAAGGTCGACTGTCTCTCGGTGCCGCAGTCGGAGATTTGCGGGCAAGGCATTTGTGTTACCGACAACAGCAAACGCGGCTATTCGTGTATATGCAATCAAGGATGGCGATGGGACAACGGCACTCAGGCGTGCACACAAGACGTCAACGAGTGCGAGGAAATGCGGCCACACTGTTCGGTCGATCCCAAAGTTGTGTGCATCAATACGCCCGGCTCGTTTGTGTGTGGTCCGTGCCCGAGCGGTTATCGAGGAAATGGCTTCTATTGTGAGGACGTCGACGAGTGTGAAGTGAATAATGGCGGCTGCAGTACATCGCCCAGAGTCGAGTGCATCAATACGCGAGGCTCCCACCGATGCGGTCAATGTCCGCTCGGCTACGAGGGCGACGGCAGAATTTGCATAGCACGACAGCCGAATGCACTGAATCAGTGCGACGACAGCAGCATTTGTAATTCTAATGCCGTTTGCTATCAATATCCAAACTCGCCGCCACAGTGCACATGCAAGTACGGATTCACGGGCAACGGCTTCGGCGACAGCGGCTGCATTCCCATCGCTGCTGATCCATGCGTCGCCGTTCGCTGTCGCAACGGAGGCACCTGCGTGCGAAACGGAACGACAGCCTACTGCACTTGCCCGCCGGGCACGAATCCACCTCTGTGCGACCGAACACTCAATAGCTGCGATCCGAATCCTTGCAGAAACGGCGGCAATTGCACGAACACCTTCCGCTTGAGTTTTCGCTGTACGTGCCCGCGCGGCTTCACGGGCCTTCGATGCGAAAATCAGCTGGCCACTTGCGGAGGCGTATTGTCGGACGAGCGAGGCACGCTGCGCTATCCAACAGACCCGGCTGCGGCCTCCTATCAGCACAACTCGCGCTGTGCCTGGCTCATCCGAACGAACATCACAAAAGTATTGAATATCACGTTTACGTCATTCGACGTCGAGTTCTCGAACGAGTGCCGATACGACTGGCTCCAAATACATGACGGACGGACGGCAGCGTCGCACATCATAGGCAGATTTTGCGGGTCGCAGCTGCCGAAAGGCGGCAACATCATTTCCACGCACAACTCACTCTATTTGTGGTTCCGATCGGATAACAGCACATCGCACTCTGGCTTTGAGCTCACGTGGGAGACGATAGATCCAGTGTGCGGTGGCGAGCAAAACATTGTGTCGCACGGAACAATCGCATCGCCCGGCTCGCCGGGAAATTATCCCATAAATAGCGAATGCGAGTGGATTCTACTCGCACCGGCCGGCAAGCGTA
TTCAATTTCTCTTCTACACGCTGATGATTGAGGCGCACACCACGTGCGGTTACGATTACCTCGAAATCCACAGCGGCATCGGAACGTCGTCGCCGAGCTTAGGGAAATTCTGCAACTCGTCGATTCCAGCGCCGCTTCTAACGCCTGGCAACGTTGCGACGATTCATTTTCACACCGACGGTGACTCGACCGATGCGGGCTTTCAAATCGCCTACTCGGTCGTCGAGGGAATTCCCGGCTGCGGTGGCGTTTATACGGCTCCGAAAGGCGATATCTCGTCGCCGACAAACATCGTCGATGGAACCTACAAGCACAATTTGATGTGCGACTATGTGATCCGGATGCCGAGCAATTCGAGAGTGCGCCTTGAGTTTAAAAAATTCGGACTGGAGGAGAGCTCGAGCTGCAAGTTTGACTCCCTCGAGATCTTCGAAGGCGACGAGGGCAACGAGGAGGGACTGATAGGTCGCTACTGCGGCACTACAACACCTCCAACCATAACATCGAGCACCAACGTTGTCACGCTGAAATTCACGACAGATTGGTCGACGAGCGACATTGGCTTCGAGCTGCAGTACCAGCTAATCTGCGGCGGAATATTTACAGCGGATGAGGGCGTATTTTCGAGTCCGAACTATCCGAACAACTACGACGCTGACCTTCTCTGCGAGTACGACATTATTGCGCCGCAAGGCAAGGTGATTTTCCTGAACATCCTCGACTTTGAGATCGAGCAGCACTCGTCGTGCGAGTTCGATTATCTGCAAATTTTCGACAGCTCATCGGCCGAAAACTCGACGAGTCTGGGCCGATATTGCGGCGACATCAGGCCCGGCACCTTTACGTCCAGCTTCAATCACATTCACCTGCAGTTTGCTTCAGATGGGTCGGTGTTTGGCACCGGTTTTCAGGCCAACTACACGTTCGTGGATGTCAGGTGTGGCGGGTTGATTAAAGACGCCAAAGAGCTCGTGAAGGCGCCCCTCGACCAGAACAACAACGGCGTCTACGAGTCGAACGCCCTCTGCAAGTGGCTGGTGGTCGCACCCAAGGGCCACGTCATTCAAATGAACATCCTGAACTTCGAACTGGAGCTCGACAACTCGTGCAAGTACGACTACTTGACCATCTACAACAATGGCTCGGGCAATGGCGGACAGGTGGGGCCGTTTTGCGGCACAAACATCCCGAAAGTCATCACGACCGTCGACAACATTGCGACCATTGTTTTCGTTTCCGACTCTTCCACATCAAAGGACGGCTTCACAATCAGCTTCAATTTCATCGATGGCACTAAACTGTGCGGCGGTAACTTCCACTCGCTGCACGGAAAGATAAAAAGTCCGGGCACCGGAATGTACTTGCCGAACAAGGAGTGTGAGTGGACCATCACCGTGCCGCACGGACAGCAGATTGAGGTGAACTTCAAGTTCTTCGACATCGAGAATCACAGCGCATGCCGCTTCGATGGGCTCGAAATCCGCAACGGCGGTAACAGTCTGGCGCCATTGCTGAGCAAGATTTGCGGCAGCACCGTGCCGCCGCCGTTCCGATCGATGGGCAATCAGCTGTACTTCCGGTTCTACTCGGACTCGTCGCGCAGCGGCACGGGCTTCGAGCTCGAATGGGACGGAACGAGTGCGGGCTGCGGCGGCATTCTAACCACAGCCAAGGGCGCGCTCATCTCGCCCAACTACCCGCTCAATTACCCGCGCAATTCGCAGTGCGAGTGGCGCATCACCGTCAACGAAGGCTCCTCCATCCACATCGTCTTCTCGGACCTCGACCTGGAGGCTAACTCCGAATGCCGCTACGACTACCTCGAGATCTTCGACGGGCCCGACCCGAGCGCGCGCAGCTTCGGCAAGTTCTGCGAGGAGCACCCGATGCACATCGAGACGAGCAGCAACCACGCGATGCTGCGCATGAACACCGACGAGTCGCACTCGGGCAGGGGCTTTCACGTGAAGTACTCGACGAATTGCAACCGCACGATCGAG
GCGGACAGTGGCGTCATCGAGTCGCCGAACTTCCCCGAAGATTACCCGAGCAACCTCGACTGCGCGTGGACGATCAAGGTGTCGCGCGGCAACAAGGTGAACCTGCAGTTCTCGCACTTCTCCATCGAGAACGACAACCTCTACCACAACGAGACGGGCGGACACATCTGCAAGCGGCCGTCGGAACCGGCTTTCGTCTCGAGTGGTACAACGAGGGCTGCGGCGGAAAGC 

Protein: 1410 (aa)

 MKLWNFLILLVSLNRIKNASAELYDDQPKILVNDGNLVFESADSKNITVKLKGRSRFLVNDLDVLTMLSNVSGTASSGNPGQLRPGLVASILQQITTLRDTVHRQSQRIDELELGASNGNANSGNNGTNTTLSDRQRINRIGRRVTALEQRLANLTERLTKDHCKSYPCQNGGTCFNLFDTYRCECPENYEGPQCTLDVNECARYVGTDLGCQNGATCVNTFGAYECTCTEGWHGKNCNRRKVDCLSVPQSEICGQGICVTDNSKRGYSCICNQGWRWDNGTQACTQDVNECEEMRPHCSVDPKVVCINTPGSFVCGPCPSGYRGNGFYCEDVDECEVNNGGCSTSPRVECINTRGSHRCGQCPLGYEGDGRICIARQPNALNQCDDSSICNSNAVCYQYPNSPPQCTCKYGFTGNGFGDSGCIPIAADPCVAVRCRNGGTCVRNGTTAYCTCPPGTNPPLCDRTLNSCDPNPCRNGGNCTNTFRLSFRCTCPRGFTGLRCENQLATCGGVLSDERGTLRYPTDPAAASYQHNSRCAWLIRTNITKVLNITFTSFDVEFSNECRYDWLQIHDGRTAASHIIGRFCGSQLPKGGNIISTHNSLYLWFRSDNSTSHSGFELTWETIDPVCGGEQNIVSHGTIASPGSPGNYPINSECEWILLAPAGKRIQFLFYTLMIEAHTTCGYDYLEIHSGIGTSSPSLGKFCNSSIPAPLLTPGNVATIHFHTDGDSTDAGFQIAYSVVEGIPGCGGVYTAPKGDISSPTNIVDGTYKHNLMCDYVIRMPSNSRVRLEFKKFGLEESSSCKFDSLEIFEGDEGNEEGLIGRYCGTTTPPTITSSTNVVTLKFTTDWSTSDIGFELQYQLICGGIFTADEGVFSSPNYPNNYDADLLCEYDIIAPQGKVIFLNILDFEIEQHSSCEFDYLQIFDSSSAENSTSLGRYCGDIRPGTFTSSFNHIHLQFASDGSVFGTGFQANYTFVDVRCGGLIKDAKELVKAPLDQNNNGVYESNALCKWLVVAPKGHVIQMNILNFELELDNSCKYDYLTIYNNGSGNGGQVGPFCGTNIPKVITTVDNIATIVFVSDSSTSKDGFTISFNFIDGTKLCGGNFHSLHGKIKSPGTGMYLPNKECEWTITVPHGQQIEVNFKFFDIENHSACRFDGLEIRNGGNSLAPLLSKICGSTVPPPFRSMGNQLYFRFYSDSSRSGTGFELEWDGTSAGCGGILTTAKGALISPNYPLNYPRNSQCEWRITVNEGSSIHIVFSDLDLEANSECRYDYLEIFDGPDPSARSFGKFCEEHPMHIETSSNHAMLRMNTDESHSGRGFHVKYSTNCNRTIEADSGVIESPNFPEDYPSNLDCAWTIKVSRGNKVNLQFSHFSIENDNLYHNETGGHICKRPSEPAFVSSGTTRAAAES 

Type Start End Length (bp)
CDS 28434 28513 80
CDS 28600 29196 597
CDS 29282 29554 273
CDS 29636 29761 126
CDS 30182 30812 631
CDS 30902 31073 172
CDS 31411 31955 545
CDS 32153 32312 160
CDS 32382 33095 714
CDS 33361 33559 199
CDS 33632 34306 675
CDS 34449 34506 58
intron 28514 28599 86
intron 29197 29281 85
intron 29555 29635 81
intron 29762 30181 420
intron 30813 30901 89
intron 31074 31410 337
intron 31956 32152 197
intron 32313 32381 69
intron 33096 33360 265
intron 33560 33631 72
intron 34307 34448 142
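
The coordinates in the table above can be cross-checked against the reported transcript and protein lengths. The following is a minimal sketch, assuming the Start and End columns are 1-based, inclusive scaffold positions (the page does not state this, but the Length column is consistent with it). The names `cds`, `span_length` and `introns` are illustrative and not part of MidgeBase.

```python
# CDS coordinates copied from the table above.
# Assumption: positions are 1-based and inclusive on scaffold PnScaf3491.
cds = [
    (28434, 28513), (28600, 29196), (29282, 29554), (29636, 29761),
    (30182, 30812), (30902, 31073), (31411, 31955), (32153, 32312),
    (32382, 33095), (33361, 33559), (33632, 34306), (34449, 34506),
]


def span_length(start, end):
    """Length of an inclusive interval [start, end]."""
    return end - start + 1


cds_lengths = [span_length(s, e) for s, e in cds]
total_cds = sum(cds_lengths)

# Introns are the gaps between consecutive CDS segments.
introns = [
    (prev_end + 1, next_start - 1)
    for (_, prev_end), (next_start, _) in zip(cds, cds[1:])
]
intron_lengths = [span_length(s, e) for s, e in introns]

print(total_cds)                        # 4230, the reported transcript length
print(total_cds // 3, total_cds % 3)    # 1410 0, the reported protein length
print(intron_lengths)
```

Under that assumption the CDS segments sum to 4230 bp, which is exactly three times the 1410 aa protein, and the derived intron spans reproduce the intron rows of the table. The annotated gene End (34509) lies 3 bp beyond the last CDS End (34506); this would be consistent with a stop codon that is not counted in the CDS, although the page does not say so.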

Auto annotation result

Program/Analysis Accession Description Score/Expectation
BLASTP/NCBI-nr XP_001843996 cubilin [Culex quinquefasciatus] gb|EDS35495.1| cubilin [Culex quinquefasciatus] 0.0
InterPro IPR000859 CUB
InterPro IPR018097 EGF-like calcium-binding, conserved site
InterPro IPR000152 EGF-type aspartate/asparagine hydroxylation site
InterPro IPR006209 EGF-like domain
InterPro IPR006210 Epidermal growth factor-like
InterPro IPR001881 EGF-like calcium-binding
InterPro IPR000742 Epidermal growth factor-like domain
InterPro IPR013032 EGF-like, conserved site
Gene Ontology(MF) GO:0005515 protein binding
Gene Ontology(MF) GO:0005509 calcium ion binding
Pfam PF12662.2 Complement Clr-like EGF-like 0.002
Pfam PF07974.8 EGF-like domain 0.0017
Pfam PF12947.2 EGF domain 2.6e-09
Pfam PF12661.2 Human growth factor-like EGF 0.0003
Pfam PF07645.10 Calcium-binding EGF domain 9.9e-18
Pfam PF02408.15 CUB-like domain 6.8e-15
Pfam PF00008.22 EGF-like domain 1.4e-22
Pfam PF00431.15 CUB domain 6.5e-210

Expression level (RPKM)

Paralog/Ortholog genes

Paralogous genes

Gene ID
Pn.08910

Orthologous genes

Species Gene ID
S. invicta SI2.2.0_03406
P. vanderplanki Pv.14800
N. vitripennis NV13464-PA
T. castaneum TC007013
A. mellifera GB17517-PA
D. plexippus DPOGS207300PA
H. sapiens ENSP00000367064
B. mori BGIBMGA014545-TA
A. aegypti AAEL010965
D. melanogaster FBgn0052702
P. humanus PHUM104920-PA
P. vanderplanki Pv.12269
C. quinquefasciatus CPIJ002327
A. gambiae AGAP005526
D. melanogaster FBgn0259140
A. aegypti AAEL014312
M. musculus ENSMUSG00000026726