MidgeBase gene description page [Pn.09808]

Outline

Gene ID: Pn.09808
Type: Protein coding gene
Scaffold: PnScaf10193
Start: 5065
End: 12430
Direction: +

Sequence

Transcript: 4989 (bp)

 ATGGAAATGCCCGCGCTGTTGCGTCACCACGGAGTCAGAGGCGGCTACGGAGAACAAGGATTTCAAGGGTTGAGAGGTGACCCCGGCGAAGGAGGAATCAACTCGAAGGGAACCAAGGGAGACCGTGGCCGTGATGGTCTCGATGGCGCGCCAGGAAATCCCGGCTTCGACGGCCTTCCGGGCGAAAAGGGCTGGCCGGGTCTGGATGGGAAAAACGGCTATGACGGCGAGAAGGGCTCGAAGGGAGCGAAGGGTCTGCCTGGAGTGAAAACCATTTTTTGCAAGGGAGAGAAAGGTCAGCCTTTCCCAGAGGTTGAACTCATTCAGAGAAACTCGACCAAGATCATTCGAGGCGGTCCCGGCTTGAAAGGAACAAAAGGAGACGAAGGCTACGAGGGTCAGCTTGGAAGACATGGTGCTCGTGGCCAGCCCGGCGTAAGGGGTCTGAAGGGCTACAAAGGTGCCGAAGGAAACGAAGGCGACAGAGGCAAGCCCGGAAAAGTTGGCCCAATTGGAGCCAGCGGTGAGAAGGGAGAAAAGGGAGCACCAGGTTTCGCTGGTCGTGACGGAGTGAAGGGCAGCAAGGGAGATAAAGGAGATGACGGCTATGAGGGCATGCCCGGCACACAAGGACCCGCTGGACCACCTGGAAGATATGATCCCAATCTCGATGAAATCAGCGTCGGCCCAATTGGAAAGCAGGGTGAAACTGGCGAGGTCGGAGAAATGGGCGTTCAAGGTATTCCCGGTGAAGTTGGACGAAGAGGTCTGCCTGGTGATCAAGGACCACCCGGCGATCCCGGCACTGATGGCGAGCGTGGAAGACAAGGCTTCACAGAGAATGGCCAAGTGGGAGATGACGGCGAAGCAGGTCCAATGGGAAGTCACGGACAACCTGGATACGAAGGCGCTAGAGGACCAAAGGGTCAAAAAGGCTACCCAGGCGAGGATGTCTATGGACTGAAAGGAGAAATTGGAGAGCCTGGTCGTAATGGAGTTTCAGACGGACTGGCTGGACTTCGTGGTGAACCGGGTCTGCCTGGAGAAAAAGGTCTTCCCGGAATAGGTTTTAACATCACAGGACCTCCTGGACCCGACGGGTTGCCCGGAAAGCAAGGACCTCCTGGAAACCCAGGCTTGGATGGCTATGACGGTGAAAAGGGAGACAAAGGTTATAGAGGTGAAGACTGCGGTTTTTGTCTCGATGGTCTTCCAGGTTTGAAGGGAGACAGAGGAGAGTCTGGAAGAAGAGGCTTGCCCGGAAAAGAGGGAGGCCGAGGCCTTGTCGGTAGAAGAGGAGAGAAAGGAGCTAGAGGCAGAGATGGATTACCAGGTTATCCGGGTCTTCAAGGAACTGCGGGCCAGCCTGGTTTGCCTGGACCTCAGGGTGAAAAAGGTAAAAAAGGTCAAATTATCTACACGGGAAATAAGCCTAAACCGGAGTTGGGAGATCAAGGCGACAACGGCTATCCAGGCTTGCAAGGACCTGATGGCGATCAAGGTGAAGTTGGCGAGGATGGAAGAAATGGCGAGCCAGGAGATCGCGGCTACCCCGGAGTTGAAGGTCTGCCTGGTTTACCGGGTAAGAACGGCAAAGACGGACGTGACGGCAGAGACGGCTTACCGGGCGACGACGCAGAATGGTTCCAATATGGAGGTGAGCCGGGAGAACCCGGTTATCACGGTAAAAAGGGTGAGAGAGGCGACCGAGGAGATCAAGGCCAGAAAGGAGAGCCCGGCTATGTGCCAAAGCTCATACTTGATAAACGTGGTTTCAAGGGACTTCCGGGTCTGGACGGTAAGAAGGGTGAGCGCGGTGCAAGAGGCGATCAAGGAGACAAAGGTGAACGAGGCGAGGAGGGTGATGTTGGTCTGAGGGGCATTCCTGGCATAATCAGAGAAGGACCCGTTGGTGCGAAGGGTTATCCGGGCTTACCCGGAAACTATGGCCCTCGAGGTGAAGACGGTCGACCAGGTTTGCCAGGAGAAGAAGGTCTGC
CCGGAATCCCTGGAATAAAAGGATCCAGAGGAGAGCCAGGCGCCGCAATTTTGTACGGAGAAATGGGTCCAGATGGTGATGAGGGTTATCCTGGCGAGAAGGGAGACAAGGGCTTCAGAGGTCAAATTGGACTGAGAGGAATTCCCGGCATTAGAGGAATTAAGGGCGAGAAGGGTGACATCGGTTTGGAAGGATTAATTGGTATCACCGGAAACAAAGGACAACGCGGTGACATCGTTTATGGAGATCGAGGCATCGAAGGTCTGCCTGGTCCCGACGGTAGAAATGGAATGTTCGGCCTGAAAGGTCATAAAGGAGATGAGGGTATTACTGGAGCCCCAGGACCTCAAGGCATAAGAGGAGAAGACGGTTTCGAAGGCCTGATTGGTCGCGATGGTGACGCAGGTGATGTTGGTGACACAGGTTCCTTGGGTATGATCGGAGTTATGGGAGATTTTGGTCTGCCTGGAGAAAAGGGAGATATTGGCAACTCTGGATTCCCCGGAAGAAGAGGCCCGATTGGTTTCATGGGACTGAAGGGTCTGCAAGGCGACAAAGGATTGCCTGGTTATCCCGGAAGGAATGGATTGAAGGGATGGAGAGGAGACTACGGTCCTCATGGTTATAAAGGACCAAAGGGAGAAGCCGCTTTTAGCGGACCAAAAGGAGAAGTTGGATTGCAAGGATTACAAGGTTTGCCCGGCTTGAATGGCTTGCCCGGAAGAAGAGGTTTAAAGGGCGAAGAAGGCGATGCTGGCAGAATCATTGATGGCAAACAGGGAATTAAGGGTTTGCCCGGCAAGAACGGAAGAGGAGGCAGAAGCGGAGCTAAAGGTCAAAGAGGAGACTCTGGCGCCCCTGGATTCAAAGGCATGAAAGGCGATGAAGGAAGAGCTGGATTCCCGGGACTTAAGGGTAGAGACGGCAGAAGGGGATACCCAGGATCCGCCGGACAAGTTGGACTAATGGGCGAAGTTGGACTGCCTGGTGACTTTGGTGAAATTGGATTTGGAATTAAAGGAGAAAGAGGCTTGATCGGCTTAATTGGATTACCCGGTTTGCAGGGAATGCCTGGCGATAAAGGCTTCTCTGGTCTGCCTGGTCCCGTCAGTCAAGGATGGGCAGAGAAGGGCGACCAAGGCGATGAAGGATTCGAAGGAGTTCAAGGCAGAAGAGGTGGAATGGGCCGTGTTGGCGAAAAGGGAGATATGGGACCAATCGGCGATTTTGGAATTGACGGAAGACCCGGTCTAAAAGGCCAGCAAGGTATAAAAGGTTTCAAAGGAGAACGTGGATTTGCTGCAGAATGGGCGGAGCCTGGTGATGAAGGTCTTGCTGGTTCTGATGGCTTCCCGGGACGACAAGGTATTCGAGGAAGCAAAGGAGCTCCGGGAGACTATGGACTCGATGGATTGCCGGGCATGGTTGGTGAACAAGGAACCAGCCCTGACGGAATTAAAGGAGTAGTAGGCGATATGGGTTTCCAGGGGGCAGTAGGTTTGCCCGGTCTGGATGGACTTCCTGGTTTGGAAGGTGACATCGGTTTGCCGGGAATCCGGGGAGAAATGGGCTTCTCTTCAGTCGGAATAAAGGGAGAGAGAGGAGACGTTGGTTTACCAGGTTTCGATGGCATCGATGGTCTTAACGGGGAGCCTGGAGATGAAGGAGAGCCGGGACCAATTGGCTACAGAGGAAGGAAAGGAGAAAGAGGCCCAATCGGCGATCCCGGTGACGAAGGACGTGATGGAATAGTCGGCTTGGGTGGCTTCAAGGGTCAAAAAGGAGAACCGTACCCAGCGAATTTGTCAAAGAGACCGATTTCAGGCTATGACGGATTGAAGGGTCAGAAGGGTGAGCAAGGTGATGTTGGTGAACCAGGTCTTCCTGGAAGACAGGGCTACAGAGGCTTGAAGGGTTACATCGGCTTGCAAGGTTTGACTGGCTTGCCTGGACCTCAGGGATACAAAGGCGAAAAAGGCTTCAGAGGAGCTCCCGGCCTG
AACGGTTTAGATGGTTTCCGGGGTCCGCCAGGAGAAAATGGCGACGACGCCCCTCCACCGCCGCCCCCCAAAAGCCGAGGCTTCGTATTCACACGGCATTCCCAATCTATCGCTGTCCCGAGGTGCCCAATCAACACAAACCTCTTGTGGGAGGGCTACTCGTTTGTGTCGGTAATCGGCAGCGGACGTGCCGTCGGCCAAGACTTGGGTCAGTCGGGATCGTGCCTCCGTCAGTTCTCCACCATGCCGTTCATGTTCTGCAACCTGAATAACGTTTGCAGCTATGCAGAAAACAACGACGACAGCATCTGGCTCACGACCGGCGAGCCCATGCCGATGTCAATGACTCCCATTCCGGCCAGAGAGATGGAGAAGTATGTCTCGCGATGCGCTGTGTGCGAAACGACCACGCGCCTCATCGCGCTCCACAGTCAGAGCATGAGCATTCCGGACTGTCCGCAGGGTTGGGAGGAGGCCTGGATCGGCTACAGTTACTACATGCAAACTTCAGACGCCAGCGGAAACTCCCATCAGAACCTCATCTCGCCTGGCTCGTGCTTGGAGGAGTTCAGAGCGCAGCCAGTCATTGAGTGCCACGGCCGAGGCACATGCAACATCTTCGATGGTATCACATCGTTCTGGCTGACAGTCATCGAGGATAGCGAGCAGTTCAGAACGCCGAAGCAGCAAACGTTGAAGGCTGATCAAACTAGCAAAATTAGTCGATGTGCTGTCTGCCGCAAGATGGACAACAGCATCATAGCTCGCGCAGCTAATCGAGTGGTTCTGCCGAATGCGTCCGCCTTCTCCGCGCGCGAGACTGAGACGTATTCGTTTAGACAAAACAACGACGTTTCGGCCATCAATCAGCCAGAGTACACTCTAGTTCAACCGCAGCCGCCACGTGGCCCTCCGCCTCCGCGACGAAGACGGCCAGGTGCGCGAACGAATCCGCGACGACTGAATCGCTCGCGAGACCAGCAGCAAGGC 

Protein: 1663 (aa)

 MEMPALLRHHGVRGGYGEQGFQGLRGDPGEGGINSKGTKGDRGRDGLDGAPGNPGFDGLPGEKGWPGLDGKNGYDGEKGSKGAKGLPGVKTIFCKGEKGQPFPEVELIQRNSTKIIRGGPGLKGTKGDEGYEGQLGRHGARGQPGVRGLKGYKGAEGNEGDRGKPGKVGPIGASGEKGEKGAPGFAGRDGVKGSKGDKGDDGYEGMPGTQGPAGPPGRYDPNLDEISVGPIGKQGETGEVGEMGVQGIPGEVGRRGLPGDQGPPGDPGTDGERGRQGFTENGQVGDDGEAGPMGSHGQPGYEGARGPKGQKGYPGEDVYGLKGEIGEPGRNGVSDGLAGLRGEPGLPGEKGLPGIGFNITGPPGPDGLPGKQGPPGNPGLDGYDGEKGDKGYRGEDCGFCLDGLPGLKGDRGESGRRGLPGKEGGRGLVGRRGEKGARGRDGLPGYPGLQGTAGQPGLPGPQGEKGKKGQIIYTGNKPKPELGDQGDNGYPGLQGPDGDQGEVGEDGRNGEPGDRGYPGVEGLPGLPGKNGKDGRDGRDGLPGDDAEWFQYGGEPGEPGYHGKKGERGDRGDQGQKGEPGYVPKLILDKRGFKGLPGLDGKKGERGARGDQGDKGERGEEGDVGLRGIPGIIREGPVGAKGYPGLPGNYGPRGEDGRPGLPGEEGLPGIPGIKGSRGEPGAAILYGEMGPDGDEGYPGEKGDKGFRGQIGLRGIPGIRGIKGEKGDIGLEGLIGITGNKGQRGDIVYGDRGIEGLPGPDGRNGMFGLKGHKGDEGITGAPGPQGIRGEDGFEGLIGRDGDAGDVGDTGSLGMIGVMGDFGLPGEKGDIGNSGFPGRRGPIGFMGLKGLQGDKGLPGYPGRNGLKGWRGDYGPHGYKGPKGEAAFSGPKGEVGLQGLQGLPGLNGLPGRRGLKGEEGDAGRIIDGKQGIKGLPGKNGRGGRSGAKGQRGDSGAPGFKGMKGDEGRAGFPGLKGRDGRRGYPGSAGQVGLMGEVGLPGDFGEIGFGIKGERGLIGLIGLPGLQGMPGDKGFSGLPGPVSQGWAEKGDQGDEGFEGVQGRRGGMGRVGEKGDMGPIGDFGIDGRPGLKGQQGIKGFKGERGFAAEWAEPGDEGLAGSDGFPGRQGIRGSKGAPGDYGLDGLPGMVGEQGTSPDGIKGVVGDMGFQGAVGLPGLDGLPGLEGDIGLPGIRGEMGFSSVGIKGERGDVGLPGFDGIDGLNGEPGDEGEPGPIGYRGRKGERGPIGDPGDEGRDGIVGLGGFKGQKGEPYPANLSKRPISGYDGLKGQKGEQGDVGEPGLPGRQGYRGLKGYIGLQGLTGLPGPQGYKGEKGFRGAPGLNGLDGFRGPPGENGDDAPPPPPPKSRGFVFTRHSQSIAVPRCPINTNLLWEGYSFVSVIGSGRAVGQDLGQSGSCLRQFSTMPFMFCNLNNVCSYAENNDDSIWLTTGEPMPMSMTPIPAREMEKYVSRCAVCETTTRLIALHSQSMSIPDCPQGWEEAWIGYSYYMQTSDASGNSHQNLISPGSCLEEFRAQPVIECHGRGTCNIFDGITSFWLTVIEDSEQFRTPKQQTLKADQTSKISRCAVCRKMDNSIIARAANRVVLPNASAFSARETETYSFRQNNDVSAINQPEYTLVQPQPPRGPPPPRRRRPGARTNPRRLNRSRDQQQG 
Type Start End Length
CDS 5065 5086 22
CDS 5136 5249 114
CDS 5324 5629 306
CDS 5689 5877 189
CDS 5944 6095 152
CDS 6163 6278 116
CDS 6342 6396 55
CDS 6461 6539 79
CDS 6597 6654 58
CDS 6715 6739 25
CDS 6803 6910 108
CDS 6983 7091 109
CDS 7153 7171 19
CDS 7234 7355 122
CDS 7415 7450 36
CDS 7521 7574 54
CDS 7637 7763 127
CDS 7825 8153 329
CDS 8215 8313 99
CDS 8377 8452 76
CDS 8516 8561 46
CDS 8624 8732 109
CDS 8796 8930 135
CDS 8987 9067 81
CDS 9136 9199 64
CDS 9267 9364 98
CDS 9421 9605 185
CDS 9663 9725 63
CDS 9816 9955 140
CDS 10131 10232 102
CDS 10292 10401 110
CDS 10466 10633 168
CDS 10695 10817 123
CDS 10875 10964 90
CDS 11024 11227 204
CDS 11290 11352 63
CDS 11415 12427 1013
intron 5087 5135 49
intron 5250 5323 74
intron 5630 5688 59
intron 5878 5943 66
intron 6096 6162 67
intron 6279 6341 63
intron 6397 6460 64
intron 6540 6596 57
intron 6655 6714 60
intron 6740 6802 63
intron 6911 6982 72
intron 7092 7152 61
intron 7172 7233 62
intron 7356 7414 59
intron 7451 7520 70
intron 7575 7636 62
intron 7764 7824 61
intron 8154 8214 61
intron 8314 8376 63
intron 8453 8515 63
intron 8562 8623 62
intron 8733 8795 63
intron 8931 8986 56
intron 9068 9135 68
intron 9200 9266 67
intron 9365 9420 56
intron 9606 9662 57
intron 9726 9815 90
intron 9956 10130 175
intron 10233 10291 59
intron 10402 10465 64
intron 10634 10694 61
intron 10818 10874 57
intron 10965 11023 59
intron 11228 11289 62
intron 11353 11414 62
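
The coordinate table above can be cross-checked against the reported transcript and protein lengths. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the listed lengths are consistent with this) and that each intron is the gap between consecutive CDS segments. The names `CDS`, `segment_length`, and `introns_between` are hypothetical and are not part of MidgeBase.

```python
# CDS (start, end) pairs copied from the table above.
# Assumption: coordinates are 1-based and inclusive on the scaffold.
CDS = [
    (5065, 5086), (5136, 5249), (5324, 5629), (5689, 5877),
    (5944, 6095), (6163, 6278), (6342, 6396), (6461, 6539),
    (6597, 6654), (6715, 6739), (6803, 6910), (6983, 7091),
    (7153, 7171), (7234, 7355), (7415, 7450), (7521, 7574),
    (7637, 7763), (7825, 8153), (8215, 8313), (8377, 8452),
    (8516, 8561), (8624, 8732), (8796, 8930), (8987, 9067),
    (9136, 9199), (9267, 9364), (9421, 9605), (9663, 9725),
    (9816, 9955), (10131, 10232), (10292, 10401), (10466, 10633),
    (10695, 10817), (10875, 10964), (11024, 11227), (11290, 11352),
    (11415, 12427),
]


def segment_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def introns_between(cds):
    """Derive intron intervals as the gaps between consecutive CDS segments."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(cds, cds[1:])
    ]


cds_total = sum(segment_length(s, e) for s, e in CDS)
introns = introns_between(CDS)
intron_total = sum(segment_length(s, e) for s, e in introns)

print(cds_total)                 # 4989, the reported transcript length in bp
print(cds_total // 3)            # 1663, the reported protein length in aa
print(len(CDS), len(introns))    # 37 36
print(introns[0], introns[-1])   # (5087, 5135) (11353, 11414)
```

The summed CDS length matches the 4989 bp transcript, and 4989 / 3 gives the 1663 aa protein. The gene End (12430) lies 3 bp past the last CDS end (12427). That would be consistent with a stop codon excluded from the CDS, although the page does not state this.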

Auto annotation result

Program/Analysis   Accession      Description                                               Score/Expectation
BLASTP/NCBI-nr     XP_001846673   collagen alpha-2(IV) chain [Culex quinquefasciatus] gb|EDS44340.1| collagen alpha-2(IV) chain [Culex quinquefasciatus]   0.0
InterPro           IPR016187      C-type lectin fold
InterPro           IPR001442      Collagen IV, non-collagenous
InterPro           IPR008160      Collagen triple helix repeat
Gene Ontology(CC)  GO:0005581     collagen
Gene Ontology(MF)  GO:0005201     extracellular matrix structural constituent
Pfam               PF01391.13     Collagen triple helix repeat (20 copies)                  1.6e-171
Pfam               PF01413.14     C-terminal tandem repeated domain in type 4 procollagen   5.4e-80

Expression level (RPKM)

Paralog/Ortholog genes

Paralogous genes

Gene ID
Pn.12591

Orthologous genes

Species Gene ID
H. sapiens ENSP00000380382
D. melanogaster FBgn0016075
P. vanderplanki Pv.17128
P. humanus PHUM136040-PA
A. mellifera GB14564-PA
B. mori BGIBMGA014039-TA
H. melpomene HMEL002288-PA
T. castaneum TC013472
S. invicta SI2.2.0_02661
H. sapiens ENSP00000445236
H. sapiens ENSP00000353654
H. sapiens ENSP00000443707
N. vitripennis NV11106-PA
M. musculus ENSMUSG00000031273
D. plexippus DPOGS206535PA
H. melpomene HMEL002285-PA
A. gambiae AGAP009200
P. vanderplanki Pv.11194
B. mori BGIBMGA014040-TA
D. plexippus DPOGS206549PA
H. sapiens ENSP00000378340
D. melanogaster FBgn0000299
C. quinquefasciatus CPIJ802363
M. musculus ENSMUSG00000031502
H. sapiens ENSP00000334733
H. sapiens ENSP00000443348
S. invicta SI2.2.0_09523
A. mellifera GB13353-PA
P. humanus PHUM136030-PA
H. sapiens ENSP00000364979
A. gambiae AGAP009201
H. sapiens ENSP00000361290
P. vanderplanki Pv.17126
T. castaneum TC014326
C. quinquefasciatus CPIJ802355