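Data Science
Protein Structure Prediction: the CF & GOR Methods
The Chou and Fasman (CF) method is one of the earliest approaches to predicting protein secondary structure.
Published in 1974, it starts from the statistical tendencies of each amino acid and predicts, probabilistically, which secondary structure (helix, sheet, or turn) each stretch of a sequence is likely to adopt.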
https://pubs.acs.org/doi/10.1021/bi00699a002
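To make the idea of "amino acid tendencies" concrete, here is a minimal sketch of the helix part only. It is a simplification for illustration, not the full Chou-Fasman procedure (which also has nucleation, extension, sheet, and turn rules) and not the code behind the Virginia server. The propensity table holds the commonly cited CF helix values, and the window size of 6 and cutoff of 1.03 are assumptions borrowed from typical descriptions of the method. The function names are hypothetical.

```python
# Commonly cited Chou-Fasman helix propensities P(alpha).
# Treat these as illustrative values; check the paper before relying on them.
P_ALPHA = {
    "E": 1.51, "M": 1.45, "A": 1.42, "L": 1.21, "K": 1.16,
    "F": 1.13, "Q": 1.11, "W": 1.08, "I": 1.08, "V": 1.06,
    "D": 1.01, "H": 1.00, "R": 0.98, "T": 0.83, "S": 0.77,
    "C": 0.70, "Y": 0.69, "N": 0.67, "P": 0.57, "G": 0.57,
}


def helix_window_scores(seq, window=6):
    """Return the mean helix propensity of every window of `window` residues."""
    scores = []
    for i in range(len(seq) - window + 1):
        chunk = seq[i:i + window]
        scores.append(sum(P_ALPHA[aa] for aa in chunk) / window)
    return scores


def helix_candidates(seq, window=6, threshold=1.03):
    """Return 0-based start positions of windows whose mean exceeds `threshold`.

    The threshold is an assumed cutoff for this sketch, not a tuned parameter.
    """
    scores = helix_window_scores(seq, window)
    return [i for i, score in enumerate(scores) if score > threshold]


# Usage: score the first 12 residues of the spike sequence used in this post.
if __name__ == "__main__":
    fragment = "MFIFLLFLTLTS"
    for start, score in enumerate(helix_window_scores(fragment)):
        print(start, fragment[start:start + 6], round(score, 2))
```

A window rich in strong helix formers such as A, E, L, or M scores above 1.0, while windows with many G or P residues score well below it.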
The University of Virginia hosts a site that takes a FASTA sequence as input and returns the CF prediction.
https://fasta.bioch.virginia.edu/fasta_www2/fasta_www.cgi?rm=misc1
Paste the FASTA example below into that site.
>AYV99761.1 spike [SARS coronavirus Urbani]
MFIFLLFLTLTSGSDLDRCTTFDDVQAPNYTQHTSSMRGVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFH
TINHTFGNPVIPFKDGIYFAATEKSNVVRGWVFGSTMNNKSQSVIIINNSTNVVIRACNFELCDNPFFAV
SKPMGTQTHTMIFDNAFNCTFEYISDAFSLDVSEKSGNFKHLREFVFKNKDGFLYVYKGYQPIDVVRDLP
SGFNTLKPIFKLPLGINITNFRAILTAFSPAQDIWGTSAAAYFVGYLKPTTFMLKYDENGTITDAVDCSQ
NPLAELKCSVKSFEIDKGIYQTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKFPSVYAWERKKISNCVA
DYSVLYNSTFFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQIAPGQTGVIADYNYKLPDDFMGCV
LAWNTRNIDATSTGNHNYKYRYLRHGKLRPFERDISNVPFSPDGKPCTPPALNCYWPLNDYGFYTTTGIG
YQPYRVVVLSFELLNAPATVCGPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQFGRDVSDFTD
SVRDPKTSEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVSTAIHADQLTPAWRIYSTGNNVFQ
TQAGCLIGAEHVDTSYECDIPIGAGICASYHTVSLLRSTSQKSIVAYTMSLGADSSIAYSNNTIAIPTNF
SISITTEVMPVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAAEQDRNTREVFAQVKQM
YKTPTLKYFGGFNFSQILPDPLKPTKRSFIEDLLFNKVTLADAGFMKQYGECLGDINARDLICAQKFNGL
TVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKQIANQFN
KAISQIQESLTTTSTALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLIT
GRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYV
PSQERNFTTAPAICHEGKAYFPREGVFVFNGTSWFITQRNFFSPQIITTDNTFVSGNCDVVIGIINNTVY
DPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQ
YIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGACSCGSCCKFDEDDSEPVLKGVKLHYT
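Before pasting a record like the one above into a web form, it can help to check that it parses cleanly and has the expected length. The following is a minimal sketch of a FASTA reader in plain Python. The function name `read_fasta` and the short demo record are hypothetical, and only the first 20 residues of the spike sequence are used to keep the example small.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical demo record: the first 20 residues of the sequence above.
example = """>demo_record first 20 residues
MFIFLLFLTL
TSGSDLDRCT
"""

records = read_fasta(example)
header, seq = records[0]
print(header, len(seq))
```

Saving the full record to a file and passing its contents to the same function returns the whole sequence as one string, which is handy for counting residues.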
Select Submit Sequence to display the result.
. . . . . .
MFIFLLFLTLTSGSDLDRCTTFDDVQAPNYTQHTSSMRGVYYPDEIFRSDTLYLTQDLFL
helix <-------> <-------> <-------------
sheet EEEEEEEEEE EEEEEEEEEE EEEEEE EEEEE EEEEEEEEEEEE
turns TTT T T T TT T
. . . . . .
PFYSNVTGFHTINHTFGNPVIPFKDGIYFAATEKSNVVRGWVFGSTMNNKSQSVIIINNS
helix > <---------------------> <----->
sheet EEEEEEEEEEEEEEE EEE EEEEEEEEEE EEEEEEEEE
turns T T TT T T T T
. . . . . .
TNVVIRACNFELCDNPFFAVSKPMGTQTHTMIFDNAFNCTFEYISDAFSLDVSEKSGNFK
helix <-------> <-----> <-------> <-----------------
sheet EEEEEEEEEEE EEEEEEEEEEEEEEEEEEEEEEEEE
turns T T T T T T
. . . . . .
HLREFVFKNKDGFLYVYKGYQPIDVVRDLPSGFNTLKPIFKLPLGINITNFRAILTAFSP
helix ----------------> <------> <--------> <----------
sheet EEEEE EEEEEEEEEEEEEE EEEEEEEEEEEEEEEEEEEEEEE
turns T T T T
. . . . . .
AQDIWGTSAAAYFVGYLKPTTFMLKYDENGTITDAVDCSQNPLAELKCSVKSFEIDKGIY
helix ----------------> <-----> <------------------------->
sheet EEEE EEEEEEEEEEEEEE EEE EE
turns T T TTT TT T
. . . . . .
QTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKFPSVYAWERKKISNCVADYSVLYNSTF
helix <------------------> <
sheet EE EEEEEEEE EEEEEEEEEEEE
turns TT TT TT T
. . . . . .
FSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQIAPGQTGVIADYNYKLPDDFMGCV
helix -------------> <---------------------> <---------
sheet EEEEEEEEEEEEEEEEEEEEEE EEEEEEEEE EEEEE
turns T TT T T
. . . . . .
LAWNTRNIDATSTGNHNYKYRYLRHGKLRPFERDISNVPFSPDGKPCTPPALNCYWPLND
helix -> <----->
sheet EEEEE EEEEE EEEEEEEEE
turns TT T T T T
. . . . . .
YGFYTTTGIGYQPYRVVVLSFELLNAPATVCGPKLSTDLIKNQCVNFNFNGLTGTGVLTP
helix <-----------> <------>
sheet EEEEEEEEEEEEEEEEEEEEEEE EEEEEEEEEEEEEEEEEEEEEEEEEEEE
turns T T T T
. . . . . .
SSKRFQPFQQFGRDVSDFTDSVRDPKTSEILDISPCSFGGVSVITPGTNASSEVAVLYQD
helix <------> <--------------> <--------->
sheet EEEEEE EEEEE EEEEEEE
turns T T T T TT T T T
. . . . . .
VNCTDVSTAIHADQLTPAWRIYSTGNNVFQTQAGCLIGAEHVDTSYECDIPIGAGICASY
helix <----------> <------------->
sheet EEEEEEEEEEEEEEE EEEEEE EEEEEEEEE EEE
turns T T T T T T T
. . . . . .
HTVSLLRSTSQKSIVAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVMPVSMAKTSVDC
helix <-------------> <------------->
sheet EEEEEE EEEEEEE EEEEEEEEEEEEEEEEEEE EEE
turns T TT T TT T
. . . . . .
NMYICGDSTECANLLLQYGSFCTQLNRALSGIAAEQDRNTREVFAQVKQMYKTPTLKYFG
helix <-----> <-----------> <-------->
sheet EEEE EEEEEEEEEEEEEEEE EEEEEEEEEEEEEEEEEEE
turns T T T T T
. . . . . .
GFNFSQILPDPLKPTKRSFIEDLLFNKVTLADAGFMKQYGECLGDINARDLICAQKFNGL
helix <----> <-----------------------> <---------->
sheet EEEEEEE EEEEEEEEEE EEEEEEE EEEEEEEEEEE
turns T T T T T T
. . . . . .
TVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYE
helix <----------------> <-------------------> <-------
sheet EEEEEEEEEEEEEEEEEEEEEEEEEEEEE EEEEEEEEEEEEEEEEEEEEEEE
turns T T T T
. . . . . .
NQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLN
helix ----------------------------------------------> <------
sheet EEEEEEEEEEEEEEEEEEEEEE EEEEEEEEEEEEEEEEEEEE EEE
turns T T T T T T T
. . . .
DILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEI
helix -------------------------------------->
sheet EEE EEEEEEEEEEEEEEEEEEEEEEE
turns T T T
Residue totals: H:606 E:592 T:105
percent: H: 60.6 E: 59.2 T: 10.5
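With the CF method, about 60.6% of residues are predicted as helix (H), 59.2% as sheet (E), and 10.5% as turns (T). The categories overlap, since the same residue can be marked as both helix and sheet, so the percentages add up to more than 100.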
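Go back to the menu and this time select the Garnier algorithm.
The Garnier algorithm is the GOR method; GOR stands for the names of its co-authors, Garnier, Osguthorpe, and Robson.
GOR has been revised several times since the original paper (GOR III and later versions). Where the CF algorithm predicts structure with roughly 50% to 60% accuracy, the later GOR versions are reported to reach about 70% or more.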
https://www.sciencedirect.com/science/article/abs/pii/0022283678902978
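Accuracy figures such as the 50% to 60% and 70% values quoted above are commonly reported as Q3, the percentage of residues whose predicted state (helix, sheet, or coil) matches the experimentally observed one. The post does not say which measure it uses, so Q3 is an assumption here. The sketch below shows the calculation on toy strings; the function name `q3` and the example strings are hypothetical.

```python
def q3(predicted, observed):
    """Percentage of positions where the predicted state equals the observed state.

    Both arguments are strings of per-residue state letters, e.g. 'HHHEEECCC'.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Toy example: 6 of 8 positions agree.
print(q3("HHHHEECC", "HHHEEECC"))
```

Computing this for a real protein needs an experimentally determined structure to compare against, which the web tools used in this post do not provide.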
The result of running the GOR method looks like this.
. 10 . 20 . 30 . 40 . 50 . 60
MFIFLLFLTLTSGSDLDRCTTFDDVQAPNYTQHTSSMRGVYYPDEIFRSDTLYLTQDLFL
helix HHHHHHHH
sheet EE E E EE EEE EEEEEEE
turns TTTTT T TT TTT TTTT TTTT T TTT T
coil CC C CCC CCC C CC
. 70 . 80 . 90 . 100 . 110 . 120
PFYSNVTGFHTINHTFGNPVIPFKDGIYFAATEKSNVVRGWVFGSTMNNKSQSVIIINNS
helix HHHHHH
sheet EEE EEEE EEEEEEE EEEEEE
turns TTT T TTT TTTTT TT TTT TT
coil C C CCC C CCCCC C CCC
. 130 . 140 . 150 . 160 . 170 . 180
TNVVIRACNFELCDNPFFAVSKPMGTQTHTMIFDNAFNCTFEYISDAFSLDVSEKSGNFK
helix HHHH H H HHHHHHHHHH HHH
sheet EEEEE E EEEEEEE EE
turns TT TTTTTTT T TTTTTTTT T TT
coil C CCC C
. 190 . 200 . 210 . 220 . 230 . 240
HLREFVFKNKDGFLYVYKGYQPIDVVRDLPSGFNTLKPIFKLPLGINITNFRAILTAFSP
helix HHHHHHHH
sheet EEEE EEEEEEEE EEEEEEEE EEEEEEEEE
turns TTTT TTTT TTT TT
coil CC CC C CCCCC
. 250 . 260 . 270 . 280 . 290 . 300
AQDIWGTSAAAYFVGYLKPTTFMLKYDENGTITDAVDCSQNPLAELKCSVKSFEIDKGIY
helix HHHHHHHHHHHHH
sheet EEEEE EEEEE EEEEE
turns TTTTT T T TTTTT TTT TTT
coil CCCCC CCC C CCC CC
. 310 . 320 . 330 . 340 . 350 . 360
QTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKFPSVYAWERKKISNCVADYSVLYNSTF
helix HHHH HHHHHH
sheet EEEEE EEEE EEEEEEEEEE
turns TTT TTT TTTTT TT TT TTTT T TT
coil C CC C CCCC C
. 370 . 380 . 390 . 400 . 410 . 420
FSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQIAPGQTGVIADYNYKLPDDFMGCV
helix HH
sheet E EEEE EEEEEE EEEE EEEE EEEEEE EEEEE
turns TTTTTTTT T T T TTTT T TTTT TTT
coil C CC C C
. 430 . 440 . 450 . 460 . 470 . 480
LAWNTRNIDATSTGNHNYKYRYLRHGKLRPFERDISNVPFSPDGKPCTPPALNCYWPLND
helix HHH
sheet EEEEE EEE EE E EEE
turns TT T TTT TTT TT TTTT TT TTT TT TTT T TTT
coil CC C CCC C CCC CC C C
. 490 . 500 . 510 . 520 . 530 . 540
YGFYTTTGIGYQPYRVVVLSFELLNAPATVCGPKLSTDLIKNQCVNFNFNGLTGTGVLTP
helix HHH H
sheet EEEE EEEEEEEEE EEEE EEEE EEEEEEE EEEE
turns TT TTTT T TT T TTTT
coil CC CC C CCC CC
. 550 . 560 . 570 . 580 . 590 . 600
SSKRFQPFQQFGRDVSDFTDSVRDPKTSEILDISPCSFGGVSVITPGTNASSEVAVLYQD
helix
sheet EE EEEE E EEEEE EEEEEE
turns TTTTT TTTTTTT TTTT T T T TTTTTT TT
coil CC C CCC C C CCCCCCC
. 610 . 620 . 630 . 640 . 650 . 660
VNCTDVSTAIHADQLTPAWRIYSTGNNVFQTQAGCLIGAEHVDTSYECDIPIGAGICASY
helix HHH HH
sheet E EEE EEE EEEE EE EEE EEEEE
turns TTTTT TT T TTTT TTT TTTTTTT TTTT
coil CCC CCC CC
. 670 . 680 . 690 . 700 . 710 . 720
HTVSLLRSTSQKSIVAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVMPVSMAKTSVDC
helix HHHHHHHH
sheet EEEEEE EEEEEEEE EEEE EEEE EEEEE
turns T T T TTTT T T TTT
coil C C C CCCC C C CCCC
. 730 . 740 . 750 . 760 . 770 . 780
NMYICGDSTECANLLLQYGSFCTQLNRALSGIAAEQDRNTREVFAQVKQMYKTPTLKYFG
helix HHHHHH HHH HHHHHHHHH
sheet EEE EEEEEE EE EEEEEEE
turns TTTT TTTT TT TT TT TT
coil CC CCCC CC
. 790 . 800 . 810 . 820 . 830 . 840
GFNFSQILPDPLKPTKRSFIEDLLFNKVTLADAGFMKQYGECLGDINARDLICAQKFNGL
helix HHHHHHHHHHHHHHHHH HHHHHHHH
sheet E EEEE EE E
turns TT TTT T TTTTTTTT TTTT
coil C CCCCCCC C
. 850 . 860 . 870 . 880 . 890 . 900
TVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYE
helix HHHHHHHHHHHH HHHHHHHHHHHH H
sheet EEEEEE EEE EEEE
turns T TTTT
coil CCCCC CCCCCCC CCC CC
. 910 . 920 . 930 . 940 . 950 . 960
NQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLN
helix HHHHHHHH HHHH
sheet EEEEE EEEEEEEE EEEEEEE EEEEEEE
turns T T T
coil CCCCC CC CCCCCC C CCC C
. 970 . 980 . 990 . 1000 . 1010 . 1020
DILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSK
helix H HHHHHHHHHHHHHH HHHHHHHHHHHHHHHHHHHHHH
sheet E EEE EEEEEEEEE
turns TT T TTT
coil C C CC
. 1030 . 1040 . 1050 . 1060 . 1070 . 1080
RVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAICHEGKAYFPREGVFVFN
helix HHHHH
sheet EEE EEE EEEEEE E EEEE
turns TTTTTT T T T TTTTT TTTT T T T
coil C CC CCC C CCCCC CC CCC
. 1090 . 1100 . 1110 . 1120 . 1130 . 1140
GTSWFITQRNFFSPQIITTDNTFVSGNCDVVIGIINNTVYDPLQPELDSFKEELDKYFKN
helix HHHHHHHHHHHHHH
sheet EEE EEEE EEEEEEEE EE
turns TTTT T TTTTTTTTT TTTT TT
coil CCCC CCC CC
. 1150 . 1160 . 1170 . 1180 . 1190 . 1200
HTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYVWL
helix H HHHHHHHHHHHHHHHHHHHHHHH HH
sheet EEEEEE EEE E E E
turns T T T T TTT TT TT
coil CCC CCCCCC C C
. 1210 . 1220 . 1230 . 1240 . 1250
GFIAGLIAIVMVTILLCCMTSCCSCLKGACSCGSCCKFDEDDSEPVLKGVKLHYT
helix HHHHH H HHHHHHHHHH
sheet EEEEE EE E EEEE
turns TT TTTTTTTTTTTTTTTTTTTT TTTT
coil C
Residue totals: H:264 E:393 T:374 C:224
percent: H: 21.3 E: 31.7 T: 30.2 C: 18.1
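The two "Residue totals" lines can be compared with a few lines of Python. This is a small sketch, with a hypothetical helper name `parse_totals`, that reads the summary lines exactly as printed above and adds up the counts.

```python
import re


def parse_totals(line):
    """Parse a line like 'Residue totals: H:606 E:592 T:105' into a dict of ints."""
    return {state: int(count)
            for state, count in re.findall(r"\b([A-Z]):\s*(\d+)", line)}


cf = parse_totals("Residue totals: H:606 E:592 T:105")
gor = parse_totals("Residue totals: H:264 E:393 T:374 C:224")

# Sum of all state counts for each method.
print(sum(cf.values()), sum(gor.values()))  # prints "1303 1255"
```

The GOR counts add up to 1,255, which matches the length of the spike sequence, so each residue appears to be assigned exactly one state. The CF counts add up to 1,303 even though the CF listing above stops at residue 1,000, which is consistent with CF marking some residues as more than one state.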