MidgeBase gene description page [Pn.09808]
Outline
Gene ID | Pn.09808 |
Type | Protein coding gene |
Scaffold | PnScaf10193 |
Start | 5065 |
End | 12430 |
Direction | + |
Sequence
Transcript: 4989 (bp)
ATGGAAATGCCCGCGCTGTTGCGTCACCACGGAGTCAGAGGCGGCTACGGAGAACAAGGATTTCAAGGGTTGAGAGGTGACCCCGGCGAAGGAGGAATCAACTCGAAGGGAACCAAGGGAGACCGTGGCCGTGATGGTCTCGATGGCGCGCCAGGAAATCCCGGCTTCGACGGCCTTCCGGGCGAAAAGGGCTGGCCGGGTCTGGATGGGAAAAACGGCTATGACGGCGAGAAGGGCTCGAAGGGAGCGAAGGGTCTGCCTGGAGTGAAAACCATTTTTTGCAAGGGAGAGAAAGGTCAGCCTTTCCCAGAGGTTGAACTCATTCAGAGAAACTCGACCAAGATCATTCGAGGCGGTCCCGGCTTGAAAGGAACAAAAGGAGACGAAGGCTACGAGGGTCAGCTTGGAAGACATGGTGCTCGTGGCCAGCCCGGCGTAAGGGGTCTGAAGGGCTACAAAGGTGCCGAAGGAAACGAAGGCGACAGAGGCAAGCCCGGAAAAGTTGGCCCAATTGGAGCCAGCGGTGAGAAGGGAGAAAAGGGAGCACCAGGTTTCGCTGGTCGTGACGGAGTGAAGGGCAGCAAGGGAGATAAAGGAGATGACGGCTATGAGGGCATGCCCGGCACACAAGGACCCGCTGGACCACCTGGAAGATATGATCCCAATCTCGATGAAATCAGCGTCGGCCCAATTGGAAAGCAGGGTGAAACTGGCGAGGTCGGAGAAATGGGCGTTCAAGGTATTCCCGGTGAAGTTGGACGAAGAGGTCTGCCTGGTGATCAAGGACCACCCGGCGATCCCGGCACTGATGGCGAGCGTGGAAGACAAGGCTTCACAGAGAATGGCCAAGTGGGAGATGACGGCGAAGCAGGTCCAATGGGAAGTCACGGACAACCTGGATACGAAGGCGCTAGAGGACCAAAGGGTCAAAAAGGCTACCCAGGCGAGGATGTCTATGGACTGAAAGGAGAAATTGGAGAGCCTGGTCGTAATGGAGTTTCAGACGGACTGGCTGGACTTCGTGGTGAACCGGGTCTGCCTGGAGAAAAAGGTCTTCCCGGAATAGGTTTTAACATCACAGGACCTCCTGGACCCGACGGGTTGCCCGGAAAGCAAGGACCTCCTGGAAACCCAGGCTTGGATGGCTATGACGGTGAAAAGGGAGACAAAGGTTATAGAGGTGAAGACTGCGGTTTTTGTCTCGATGGTCTTCCAGGTTTGAAGGGAGACAGAGGAGAGTCTGGAAGAAGAGGCTTGCCCGGAAAAGAGGGAGGCCGAGGCCTTGTCGGTAGAAGAGGAGAGAAAGGAGCTAGAGGCAGAGATGGATTACCAGGTTATCCGGGTCTTCAAGGAACTGCGGGCCAGCCTGGTTTGCCTGGACCTCAGGGTGAAAAAGGTAAAAAAGGTCAAATTATCTACACGGGAAATAAGCCTAAACCGGAGTTGGGAGATCAAGGCGACAACGGCTATCCAGGCTTGCAAGGACCTGATGGCGATCAAGGTGAAGTTGGCGAGGATGGAAGAAATGGCGAGCCAGGAGATCGCGGCTACCCCGGAGTTGAAGGTCTGCCTGGTTTACCGGGTAAGAACGGCAAAGACGGACGTGACGGCAGAGACGGCTTACCGGGCGACGACGCAGAATGGTTCCAATATGGAGGTGAGCCGGGAGAACCCGGTTATCACGGTAAAAAGGGTGAGAGAGGCGACCGAGGAGATCAAGGCCAGAAAGGAGAGCCCGGCTATGTGCCAAAGCTCATACTTGATAAACGTGGTTTCAAGGGACTTCCGGGTCTGGACGGTAAGAAGGGTGAGCGCGGTGCAAGAGGCGATCAAGGAGACAAAGGTGAACGAGGCGAGGAGGGTGATGTTGGTCTGAGGGGCATTCCTGGCATAATCAGAGAAGGACCCGTTGGTGCGAAGGGTTATCCGGGCTTACCCGGAAACTATGGCCCTCGAGGTGAAGACGGTCGACCAGGTTTGCCAGGAGAAGAAGGTCTGCC
CGGAATCCCTGGAATAAAAGGATCCAGAGGAGAGCCAGGCGCCGCAATTTTGTACGGAGAAATGGGTCCAGATGGTGATGAGGGTTATCCTGGCGAGAAGGGAGACAAGGGCTTCAGAGGTCAAATTGGACTGAGAGGAATTCCCGGCATTAGAGGAATTAAGGGCGAGAAGGGTGACATCGGTTTGGAAGGATTAATTGGTATCACCGGAAACAAAGGACAACGCGGTGACATCGTTTATGGAGATCGAGGCATCGAAGGTCTGCCTGGTCCCGACGGTAGAAATGGAATGTTCGGCCTGAAAGGTCATAAAGGAGATGAGGGTATTACTGGAGCCCCAGGACCTCAAGGCATAAGAGGAGAAGACGGTTTCGAAGGCCTGATTGGTCGCGATGGTGACGCAGGTGATGTTGGTGACACAGGTTCCTTGGGTATGATCGGAGTTATGGGAGATTTTGGTCTGCCTGGAGAAAAGGGAGATATTGGCAACTCTGGATTCCCCGGAAGAAGAGGCCCGATTGGTTTCATGGGACTGAAGGGTCTGCAAGGCGACAAAGGATTGCCTGGTTATCCCGGAAGGAATGGATTGAAGGGATGGAGAGGAGACTACGGTCCTCATGGTTATAAAGGACCAAAGGGAGAAGCCGCTTTTAGCGGACCAAAAGGAGAAGTTGGATTGCAAGGATTACAAGGTTTGCCCGGCTTGAATGGCTTGCCCGGAAGAAGAGGTTTAAAGGGCGAAGAAGGCGATGCTGGCAGAATCATTGATGGCAAACAGGGAATTAAGGGTTTGCCCGGCAAGAACGGAAGAGGAGGCAGAAGCGGAGCTAAAGGTCAAAGAGGAGACTCTGGCGCCCCTGGATTCAAAGGCATGAAAGGCGATGAAGGAAGAGCTGGATTCCCGGGACTTAAGGGTAGAGACGGCAGAAGGGGATACCCAGGATCCGCCGGACAAGTTGGACTAATGGGCGAAGTTGGACTGCCTGGTGACTTTGGTGAAATTGGATTTGGAATTAAAGGAGAAAGAGGCTTGATCGGCTTAATTGGATTACCCGGTTTGCAGGGAATGCCTGGCGATAAAGGCTTCTCTGGTCTGCCTGGTCCCGTCAGTCAAGGATGGGCAGAGAAGGGCGACCAAGGCGATGAAGGATTCGAAGGAGTTCAAGGCAGAAGAGGTGGAATGGGCCGTGTTGGCGAAAAGGGAGATATGGGACCAATCGGCGATTTTGGAATTGACGGAAGACCCGGTCTAAAAGGCCAGCAAGGTATAAAAGGTTTCAAAGGAGAACGTGGATTTGCTGCAGAATGGGCGGAGCCTGGTGATGAAGGTCTTGCTGGTTCTGATGGCTTCCCGGGACGACAAGGTATTCGAGGAAGCAAAGGAGCTCCGGGAGACTATGGACTCGATGGATTGCCGGGCATGGTTGGTGAACAAGGAACCAGCCCTGACGGAATTAAAGGAGTAGTAGGCGATATGGGTTTCCAGGGGGCAGTAGGTTTGCCCGGTCTGGATGGACTTCCTGGTTTGGAAGGTGACATCGGTTTGCCGGGAATCCGGGGAGAAATGGGCTTCTCTTCAGTCGGAATAAAGGGAGAGAGAGGAGACGTTGGTTTACCAGGTTTCGATGGCATCGATGGTCTTAACGGGGAGCCTGGAGATGAAGGAGAGCCGGGACCAATTGGCTACAGAGGAAGGAAAGGAGAAAGAGGCCCAATCGGCGATCCCGGTGACGAAGGACGTGATGGAATAGTCGGCTTGGGTGGCTTCAAGGGTCAAAAAGGAGAACCGTACCCAGCGAATTTGTCAAAGAGACCGATTTCAGGCTATGACGGATTGAAGGGTCAGAAGGGTGAGCAAGGTGATGTTGGTGAACCAGGTCTTCCTGGAAGACAGGGCTACAGAGGCTTGAAGGGTTACATCGGCTTGCAAGGTTTGACTGGCTTGCCTGGACCTCAGGGATACAAAGGCGAAAAAGGCTTCAGAGGAGCTCCCGGCCTGA
ACGGTTTAGATGGTTTCCGGGGTCCGCCAGGAGAAAATGGCGACGACGCCCCTCCACCGCCGCCCCCCAAAAGCCGAGGCTTCGTATTCACACGGCATTCCCAATCTATCGCTGTCCCGAGGTGCCCAATCAACACAAACCTCTTGTGGGAGGGCTACTCGTTTGTGTCGGTAATCGGCAGCGGACGTGCCGTCGGCCAAGACTTGGGTCAGTCGGGATCGTGCCTCCGTCAGTTCTCCACCATGCCGTTCATGTTCTGCAACCTGAATAACGTTTGCAGCTATGCAGAAAACAACGACGACAGCATCTGGCTCACGACCGGCGAGCCCATGCCGATGTCAATGACTCCCATTCCGGCCAGAGAGATGGAGAAGTATGTCTCGCGATGCGCTGTGTGCGAAACGACCACGCGCCTCATCGCGCTCCACAGTCAGAGCATGAGCATTCCGGACTGTCCGCAGGGTTGGGAGGAGGCCTGGATCGGCTACAGTTACTACATGCAAACTTCAGACGCCAGCGGAAACTCCCATCAGAACCTCATCTCGCCTGGCTCGTGCTTGGAGGAGTTCAGAGCGCAGCCAGTCATTGAGTGCCACGGCCGAGGCACATGCAACATCTTCGATGGTATCACATCGTTCTGGCTGACAGTCATCGAGGATAGCGAGCAGTTCAGAACGCCGAAGCAGCAAACGTTGAAGGCTGATCAAACTAGCAAAATTAGTCGATGTGCTGTCTGCCGCAAGATGGACAACAGCATCATAGCTCGCGCAGCTAATCGAGTGGTTCTGCCGAATGCGTCCGCCTTCTCCGCGCGCGAGACTGAGACGTATTCGTTTAGACAAAACAACGACGTTTCGGCCATCAATCAGCCAGAGTACACTCTAGTTCAACCGCAGCCGCCACGTGGCCCTCCGCCTCCGCGACGAAGACGGCCAGGTGCGCGAACGAATCCGCGACGACTGAATCGCTCGCGAGACCAGCAGCAAGGC
Protein: 1663 (aa)
MEMPALLRHHGVRGGYGEQGFQGLRGDPGEGGINSKGTKGDRGRDGLDGAPGNPGFDGLPGEKGWPGLDGKNGYDGEKGSKGAKGLPGVKTIFCKGEKGQPFPEVELIQRNSTKIIRGGPGLKGTKGDEGYEGQLGRHGARGQPGVRGLKGYKGAEGNEGDRGKPGKVGPIGASGEKGEKGAPGFAGRDGVKGSKGDKGDDGYEGMPGTQGPAGPPGRYDPNLDEISVGPIGKQGETGEVGEMGVQGIPGEVGRRGLPGDQGPPGDPGTDGERGRQGFTENGQVGDDGEAGPMGSHGQPGYEGARGPKGQKGYPGEDVYGLKGEIGEPGRNGVSDGLAGLRGEPGLPGEKGLPGIGFNITGPPGPDGLPGKQGPPGNPGLDGYDGEKGDKGYRGEDCGFCLDGLPGLKGDRGESGRRGLPGKEGGRGLVGRRGEKGARGRDGLPGYPGLQGTAGQPGLPGPQGEKGKKGQIIYTGNKPKPELGDQGDNGYPGLQGPDGDQGEVGEDGRNGEPGDRGYPGVEGLPGLPGKNGKDGRDGRDGLPGDDAEWFQYGGEPGEPGYHGKKGERGDRGDQGQKGEPGYVPKLILDKRGFKGLPGLDGKKGERGARGDQGDKGERGEEGDVGLRGIPGIIREGPVGAKGYPGLPGNYGPRGEDGRPGLPGEEGLPGIPGIKGSRGEPGAAILYGEMGPDGDEGYPGEKGDKGFRGQIGLRGIPGIRGIKGEKGDIGLEGLIGITGNKGQRGDIVYGDRGIEGLPGPDGRNGMFGLKGHKGDEGITGAPGPQGIRGEDGFEGLIGRDGDAGDVGDTGSLGMIGVMGDFGLPGEKGDIGNSGFPGRRGPIGFMGLKGLQGDKGLPGYPGRNGLKGWRGDYGPHGYKGPKGEAAFSGPKGEVGLQGLQGLPGLNGLPGRRGLKGEEGDAGRIIDGKQGIKGLPGKNGRGGRSGAKGQRGDSGAPGFKGMKGDEGRAGFPGLKGRDGRRGYPGSAGQVGLMGEVGLPGDFGEIGFGIKGERGLIGLIGLPGLQGMPGDKGFSGLPGPVSQGWAEKGDQGDEGFEGVQGRRGGMGRVGEKGDMGPIGDFGIDGRPGLKGQQGIKGFKGERGFAAEWAEPGDEGLAGSDGFPGRQGIRGSKGAPGDYGLDGLPGMVGEQGTSPDGIKGVVGDMGFQGAVGLPGLDGLPGLEGDIGLPGIRGEMGFSSVGIKGERGDVGLPGFDGIDGLNGEPGDEGEPGPIGYRGRKGERGPIGDPGDEGRDGIVGLGGFKGQKGEPYPANLSKRPISGYDGLKGQKGEQGDVGEPGLPGRQGYRGLKGYIGLQGLTGLPGPQGYKGEKGFRGAPGLNGLDGFRGPPGENGDDAPPPPPPKSRGFVFTRHSQSIAVPRCPINTNLLWEGYSFVSVIGSGRAVGQDLGQSGSCLRQFSTMPFMFCNLNNVCSYAENNDDSIWLTTGEPMPMSMTPIPAREMEKYVSRCAVCETTTRLIALHSQSMSIPDCPQGWEEAWIGYSYYMQTSDASGNSHQNLISPGSCLEEFRAQPVIECHGRGTCNIFDGITSFWLTVIEDSEQFRTPKQQTLKADQTSKISRCAVCRKMDNSIIARAANRVVLPNASAFSARETETYSFRQNNDVSAINQPEYTLVQPQPPRGPPPPRRRRPGARTNPRRLNRSRDQQQG
Type | Start | End | Length |
CDS | 5065 | 5086 | 22 |
CDS | 5136 | 5249 | 114 |
CDS | 5324 | 5629 | 306 |
CDS | 5689 | 5877 | 189 |
CDS | 5944 | 6095 | 152 |
CDS | 6163 | 6278 | 116 |
CDS | 6342 | 6396 | 55 |
CDS | 6461 | 6539 | 79 |
CDS | 6597 | 6654 | 58 |
CDS | 6715 | 6739 | 25 |
CDS | 6803 | 6910 | 108 |
CDS | 6983 | 7091 | 109 |
CDS | 7153 | 7171 | 19 |
CDS | 7234 | 7355 | 122 |
CDS | 7415 | 7450 | 36 |
CDS | 7521 | 7574 | 54 |
CDS | 7637 | 7763 | 127 |
CDS | 7825 | 8153 | 329 |
CDS | 8215 | 8313 | 99 |
CDS | 8377 | 8452 | 76 |
CDS | 8516 | 8561 | 46 |
CDS | 8624 | 8732 | 109 |
CDS | 8796 | 8930 | 135 |
CDS | 8987 | 9067 | 81 |
CDS | 9136 | 9199 | 64 |
CDS | 9267 | 9364 | 98 |
CDS | 9421 | 9605 | 185 |
CDS | 9663 | 9725 | 63 |
CDS | 9816 | 9955 | 140 |
CDS | 10131 | 10232 | 102 |
CDS | 10292 | 10401 | 110 |
CDS | 10466 | 10633 | 168 |
CDS | 10695 | 10817 | 123 |
CDS | 10875 | 10964 | 90 |
CDS | 11024 | 11227 | 204 |
CDS | 11290 | 11352 | 63 |
CDS | 11415 | 12427 | 1013 |
intron | 5087 | 5135 | 49 |
intron | 5250 | 5323 | 74 |
intron | 5630 | 5688 | 59 |
intron | 5878 | 5943 | 66 |
intron | 6096 | 6162 | 67 |
intron | 6279 | 6341 | 63 |
intron | 6397 | 6460 | 64 |
intron | 6540 | 6596 | 57 |
intron | 6655 | 6714 | 60 |
intron | 6740 | 6802 | 63 |
intron | 6911 | 6982 | 72 |
intron | 7092 | 7152 | 61 |
intron | 7172 | 7233 | 62 |
intron | 7356 | 7414 | 59 |
intron | 7451 | 7520 | 70 |
intron | 7575 | 7636 | 62 |
intron | 7764 | 7824 | 61 |
intron | 8154 | 8214 | 61 |
intron | 8314 | 8376 | 63 |
intron | 8453 | 8515 | 63 |
intron | 8562 | 8623 | 62 |
intron | 8733 | 8795 | 63 |
intron | 8931 | 8986 | 56 |
intron | 9068 | 9135 | 68 |
intron | 9200 | 9266 | 67 |
intron | 9365 | 9420 | 56 |
intron | 9606 | 9662 | 57 |
intron | 9726 | 9815 | 90 |
intron | 9956 | 10130 | 175 |
intron | 10233 | 10291 | 59 |
intron | 10402 | 10465 | 64 |
intron | 10634 | 10694 | 61 |
intron | 10818 | 10874 | 57 |
intron | 10965 | 11023 | 59 |
intron | 11228 | 11289 | 62 |
intron | 11353 | 11414 | 62 |
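The coordinates in the table above follow a 1-based, inclusive convention, so each length equals End - Start + 1, and the intron rows are simply the gaps between consecutive CDS rows. The sketch below reproduces that arithmetic from the CDS rows alone; the helper names `interval_length` and `introns_between` are illustrative, and the inclusive-coordinate reading is inferred from the listed lengths rather than stated by the page.

```python
# CDS intervals (start, end) copied from the table above; 1-based, inclusive.
CDS = [
    (5065, 5086), (5136, 5249), (5324, 5629), (5689, 5877), (5944, 6095),
    (6163, 6278), (6342, 6396), (6461, 6539), (6597, 6654), (6715, 6739),
    (6803, 6910), (6983, 7091), (7153, 7171), (7234, 7355), (7415, 7450),
    (7521, 7574), (7637, 7763), (7825, 8153), (8215, 8313), (8377, 8452),
    (8516, 8561), (8624, 8732), (8796, 8930), (8987, 9067), (9136, 9199),
    (9267, 9364), (9421, 9605), (9663, 9725), (9816, 9955), (10131, 10232),
    (10292, 10401), (10466, 10633), (10695, 10817), (10875, 10964),
    (11024, 11227), (11290, 11352), (11415, 12427),
]


def interval_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def introns_between(exons):
    """Gaps between consecutive exons, as inclusive (start, end) pairs."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


cds_total = sum(interval_length(s, e) for s, e in CDS)
introns = introns_between(CDS)

print(cds_total)       # 4989, the transcript length listed under Sequence
print(cds_total // 3)  # 1663, the protein length listed under Sequence
print(len(introns))    # 36
print(introns[0])      # (5087, 5135)
```

The last CDS ends at 12427 while the gene End is 12430. The three-base difference would be consistent with a stop codon that is counted in the gene span but not in the CDS, though the page does not say so explicitly.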
Auto annotation result
Program/Analysis | Accession | Description | Score/Expectation |
BLASTP/NCBI-nr | XP_001846673 | collagen alpha-2(IV) chain [Culex quinquefasciatus] gb|EDS44340.1| collagen alpha-2(IV) chain [Culex quinquefasciatus] | 0.0 |
InterPro | IPR016187 | C-type lectin fold | |
InterPro | IPR001442 | Collagen IV, non-collagenous | |
InterPro | IPR008160 | Collagen triple helix repeat | |
Gene Ontology(CC) | GO:0005581 | collagen | |
Gene Ontology(MF) | GO:0005201 | extracellular matrix structural constituent | |
Pfam | PF01391.13 | Collagen triple helix repeat (20 copies) | 1.6e-171 |
Pfam | PF01413.14 | C-terminal tandem repeated domain in type 4 procollagen | 5.4e-80 |
Expression level (RPKM)
Paralog/Ortholog genes
Paralogous genes
Orthologous genes
Species | Gene ID |
H. sapiens | ENSP00000380382 |
D. melanogaster | FBgn0016075 |
P. vanderplanki | Pv.17128 |
P. humanus | PHUM136040-PA |
A. mellifera | GB14564-PA |
B. mori | BGIBMGA014039-TA |
H. melpomene | HMEL002288-PA |
T. castaneum | TC013472 |
S. invicta | SI2.2.0_02661 |
H. sapiens | ENSP00000445236 |
H. sapiens | ENSP00000353654 |
H. sapiens | ENSP00000443707 |
N. vitripennis | NV11106-PA |
M. musculus | ENSMUSG00000031273 |
D. plexippus | DPOGS206535PA |
H. melpomene | HMEL002285-PA |
A. gambiae | AGAP009200 |
P. vanderplanki | Pv.11194 |
B. mori | BGIBMGA014040-TA |
D. plexippus | DPOGS206549PA |
H. sapiens | ENSP00000378340 |
D. melanogaster | FBgn0000299 |
C. quinquefasciatus | CPIJ802363 |
M. musculus | ENSMUSG00000031502 |
H. sapiens | ENSP00000334733 |
H. sapiens | ENSP00000443348 |
S. invicta | SI2.2.0_09523 |
A. mellifera | GB13353-PA |
P. humanus | PHUM136030-PA |
H. sapiens | ENSP00000364979 |
A. gambiae | AGAP009201 |
H. sapiens | ENSP00000361290 |
P. vanderplanki | Pv.17126 |
T. castaneum | TC014326 |
C. quinquefasciatus | CPIJ802355 |