A .helixseq file contains
two columns specifying the starting and the ending residue numbers of each
helix, respectively, one helix per line. An example, grec.helixseq, is shown below:
11 33
47 67
75 99
103 125
145 163
168 188
224 246
260 281
291 310
315 334
347 373
380 399
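The two-column layout can be read with a few lines of code. The following is a minimal sketch, not part of VHMPT: the function name `parse_helixseq` is hypothetical, and it assumes the file holds only whitespace-separated integers, so it accepts the numbers whether they are laid out one pair per line or not.

```python
def parse_helixseq(text):
    """Return a list of (start, end) residue-number pairs, one per helix.

    Hypothetical helper; assumes the text contains only whitespace-separated
    integers, read in order as start/end pairs.
    """
    numbers = [int(token) for token in text.split()]
    if len(numbers) % 2 != 0:
        raise ValueError("expected an even count of residue numbers")
    # Pair consecutive numbers: (1st, 2nd), (3rd, 4th), ...
    helices = list(zip(numbers[0::2], numbers[1::2]))
    for start, end in helices:
        if start > end:
            raise ValueError(f"helix start {start} is after its end {end}")
    return helices


if __name__ == "__main__":
    # First three helices of grec.helixseq.
    print(parse_helixseq("11 33\n47 67\n75 99\n"))
```

Checking that each start does not exceed its end is an extra sanity check added here; the source does not say whether VHMPT performs it.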
A .aa file contains the one-letter
codes of the amino acid sequence of the reference protein. VHMPT
will skip lines containing only sequence numbers or starting with a '#'
sign. An example, grec.aa, is shown below:
#grec.aa file
# this file contains the amino acid sequence of the protein "grec"
#
1                                               50
MYYLKNTNFWMFGLFFFFYFFIMGAYFPFFPIWLHDINHISKSDTGIIFA
AISLFSLLFQPLFGLLSDKLGLRKYLLWIITGMLVMFAPFFIFIFGPLLQ
YNILVGSIVGGIYLGFCFNAGAPAVEAFIEKVSRRSNFEFGRARMFGCVG
WALCASIVGIMFTINNQFVFWLGSGCALILAVLLFFAKTDAPSSATVANA
VGANHSAFSLKLALELFRQPKLWFLSLYVIGVSCTYDVFDQQFANFFTSF
FATGEQGTRVFGYVTTMGELLNASIMFFAPLIINRIGGKNALLLAGTIMS
VRIIGSSFATSALEVVILKTLHMFEVPFLLVGCFKYITSQFEVRFSATIY
LVCFCFFKQLAMIFMSVLAGNMYESIGFQGAYLVLGLVALGFTLISVFTL
1              17
SGPGPLSLLRRQVNEVA
###################################################
#total 417 a.a.
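The skipping rule described for .aa files can be sketched as follows. This is an illustration only, not VHMPT's own reader: the function name `read_aa` is hypothetical, and it assumes that a "line containing only sequence numbers" means a line whose whitespace-separated tokens are all digits, and that leading whitespace before a '#' is ignored.

```python
def read_aa(text):
    """Return the amino acid sequence from .aa file text as one string.

    Hypothetical helper. Skips blank lines, lines starting with '#', and
    lines made up only of numbers; joins everything else.
    """
    pieces = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # comment or empty line
        if all(token.isdigit() for token in stripped.split()):
            continue  # ruler line such as "1    50"
        # Drop any internal spaces so grouped residues are joined.
        pieces.append("".join(stripped.split()))
    return "".join(pieces)


if __name__ == "__main__":
    sample = "#demo\n1    5\nMYYLK\n1 2\nNT\n#total 7 a.a.\n"
    sequence = read_aa(sample)
    print(sequence, len(sequence))
```

Comparing the length of the returned string with the total stated in the file's last comment line (417 for grec.aa) is a quick way to confirm that nothing was dropped.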
A .msf file is a multiple
sequence alignment file generated by GCG. An example of such a file, grec.msf,
is shown below:
!!AA_MULTIPLE_ALIGNMENT 1.0
PileUp of: @7s.lst

Symbol comparison table: GenRunData:blosum62.cmp  CompCheck: 6430

GapWeight: 12
GapLengthWeight: 4

7s.msf  MSF: 428  Type: P  March 20, 1998 11:01  Check: 5958 ..

Name: GREC     Len: 428  Check: 4158  Weight: 1.00
Name: JC2544   Len: 428  Check: 3354  Weight: 1.00
Name: JT0487   Len: 428  Check: 4253  Weight: 1.00
Name: B43717   Len: 428  Check: 1155  Weight: 1.00
Name: GRECST   Len: 428  Check: 3038  Weight: 1.00

//

        1                                                   50
GREC    ~~~~~MYYLK NTNFWMFGLF FFFYFFIMGA YFPFFPIWLH DINHISKSDT
JC2544  ~~~~~MYYLK NTNFWMFGFF FFFYFFIMGA YFPFFPIWLH EVNHISKGDT
JT0487  MKLSELAPRE RHNFIYFMLF FFFYYFIMSA YFPFFPVWLA EVNHLTKTET
B43717  ~~MNSASTHK NTDFWIFGLF FFLYFFIMAT CFPFLPVWLS DVVGLSKTDT
GRECST  ~~MALNIPFR NAYYRFASSY SFLFFISWSL WWSLYAIWLK GHLGLTGTEL

        51                                                 100
GREC    GIIFAAISLF SLLFQPLFGL LSDKLGLRKY LLWIITGMLV MFAPFFIFIF
JC2544  GIIFACISLF SLLFQPIFGL LSDKLGLRKH LLWVITGMLV MFAPFFIYVF
JT0487  GIVFSCISLF AIIFQPVFGL ISDKLGLRKH LLWTITILLI LFAPFFIFVF
B43717  GIVFSCLSLF AISFQPLLGV ISDRLGLKKN LIWSISLLLV FFAPFFLYVF
GRECST  GTLYSVNQFT SILFMMFYGI VQDKLGLKKP LIWCMSFILV LTGPFMIYVY

        101                                                150
GREC    GPLLQYNILV GSIVGGIYLG FCFNAGAPAV EAFIEKVSRR SNFEFGRARM
JC2544  GPLLQVNILL GSIVGGIYLG FIYNAGAPAI EAYIEKASRR SNFEFGRARM
JT0487  SPLLQMNIMA GALVGGVYLG IVFSSGSGAV EAYIERVSRA NRFEYGKVRV
B43717  APLLHLNIWA GALTGGVFIG FVFSAGAGAI EAYIERVSRS SGFEYGKARM
GRECST  EPLLQSNFSV GLILGALFFG LGYLAGCGLL DSFTEKMARN FHFEYGTARA

        151                                                200
GREC    FGCVGWALCA SIVGIMFTIN NQFVFWLGSG CALILAVLLF FAKTDAPSSA
JC2544  FGCVGWALCA SIAGIMFTIN NQFVFWLGSG CAVILALLLL FSKTDVPSSA
JT0487  SGCVGWALCA SITGILFSID PNITFWIASG FALILGVLLW VSKPESSNSA
B43717  FGCLGWALCA TMAGILFNVD PSLVFWMGSG GALLLLLLLY LARPSTSQTA
GRECST  WGSFGYAIGA FFAGIFFSIS PHINFWLVSL FGAVFMMINM RFK.DKDHQC

        201                                                250
GREC    TVANAVGANH SAFSLKLALE LFRQPKLWFL SLYVIGVSCT YDVFDQQ.FA
JC2544  KVADAVGANN SAFSLKLALE LFKQPKLWLI SLYVVGVSCT YDVFDQQ.FA
JT0487  EVIDALGANR QAFSMRTAAE LFRMPRFWGF IIYVVGVASV YDVFDQQ.FA
B43717  MVMNALGANS SLISTRMVFS LFRMRQMWMF VLYTIGVACV YDVFDQQ.FA
GRECST  IAADAGGVKK EDF.....IA VFKDRNFWVF VIFIVGTWSF YNIFDQQLFP

        251                                                300
GREC    NFFTSFFATG EQGTRVFGYV TTMGELLNAS IMFFAPLIIN RIGGKNALLL
JC2544  NFFTSFFATG EQGTRVFGYV TTMGELLNAS IMFFAPLIVN RIGGKNALLL
JT0487  NFFKGFFSSP QRGTEVFGFV TTGGELLNAL IMFCAPAIIN RIGAKNALLI
B43717  IFFRSFFDTP QAGIKAFGFA TTAGEICNAI IMFCTPWIIN RIGAKNTLLV
GRECST  VFYAGLFESH DVGTRLYGYL NSFQVVLEAL CMAIIPFFVN RVGPKNALLI

        301                                                350
GREC    AGTIMSVRII GSSFATSALE VVILKTLHMF EVPFLLVGCF KYITSQFEVR
JC2544  AGTIMSVRII GSHSHT.ALE VVILKTLHMF EIPFLIVGCF KYITSQFEVR
JT0487  AGLIMSVRIL GSSFATSAVE VIILKMLHMF EIPFLLVGTF KYISSAFKGK
B43717  AGGIMTIRIT GSAFATTMTE VVILKMLHAL EVPFLLVGAF KYITGVFDTR
GRECST  GVVIMALRIL SCALFVNPWI ISLVKLLHAI EVPLCVISVF KYSVANFDKR

        351                                                400
GREC    FSATIYLVCF CFFKQLAMIF MSVLAGNMYE SIGFQGAYLV LGLVALGFTL
JC2544  FSATIYLVCF CFFKQLAMIF MSVLAGKMYE SIGFQGAYLV LGIIRVSFTL
JT0487  LSATLFLIGF NLSKQLSSVV LSAWVGRMYD TVGFHQAYLI LGCITLSFTV
B43717  LSATVYLIGF QFSKQLAAIL LSTFAGHLYD RMGFQNTYFV LGMIVLTVTV
GRECST  LSSTIFLIGF QIASSLGIVL LSTPTGILFD HAGYQTVFFA ISGIVCLMLL

        401                        428
GREC    ISVFTLSGPG PLSLLRRQVN EVA~~~~~
JC2544  ISVFTLSGPG PFSLLRRRES VAL~~~~~
JT0487  ISLFTLKGSK TLLPATA~~~ ~~~~~~~~
B43717  ISAFTLSSSP GIVHPSVEKA PVAHSEIN
GRECST  FGIFFLSKKR EQIVMETPVP SAI~~~~~
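To show how the interleaved blocks of an .msf file fit together, here is a minimal reading sketch. It is not VHMPT's parser and not a full implementation of the GCG format: the function name `parse_msf` is hypothetical, and it assumes that the alignment starts after a line containing only `//`, that each alignment line begins with the sequence name, and that lines holding only position numbers can be ignored. Gap characters (`~` and `.` in the example) are kept as they appear.

```python
def parse_msf(text):
    """Return {sequence name: aligned sequence} from MSF-style text.

    Hypothetical helper; header lines before '//' are ignored and the
    checksums are not verified.
    """
    sequences = {}
    in_alignment = False
    for line in text.splitlines():
        tokens = line.split()
        if not in_alignment:
            # The '//' line separates the header from the alignment blocks.
            in_alignment = tokens == ["//"]
            continue
        if len(tokens) < 2 or all(token.isdigit() for token in tokens):
            continue  # blank line or position ruler such as "1   50"
        name, chunks = tokens[0], tokens[1:]
        # Append this block's residues to what was read in earlier blocks.
        sequences[name] = sequences.get(name, "") + "".join(chunks)
    return sequences


if __name__ == "__main__":
    demo = (
        "Name: A Len: 15\nName: B Len: 15\n//\n\n"
        "1 10\nA ~~MYYLKNTN\nB MKLSELAPRE\n\n"
        "11 15\nA FWMFG\nB RHNFI\n"
    )
    for name, aligned in parse_msf(demo).items():
        print(name, aligned, len(aligned))
```

Every sequence returned for one file should have the same length, matching the `MSF:` value in the header (428 for grec.msf); checking this is a simple guard against a truncated file.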
Appendix D
| F1: | the amino acid sequence number. |
| F2: | the consensus amino acid (it can be a gap). |
| F3: | the number of occurrences of the consensus amino acid at that position. |
| F4: | the number of gaps at that position. |
| F5: | the normalized conservation score at that position (1 being strictly conserved). |
| F6: | the amino acid of the reference protein. |
| F7: | all the amino acids at that aligned position. |
1 M 2 0 0.53 M MMLAN
2 Y 2 0 0.32 Y YYASI
3 P 2 0 0.23 Y YYPTP
4 L 2 0 0.48 L LLRHF
5 K 3 0 0.75 K KKEKR
6 N 4 0 0.81 N NNRNN
7 T 3 0 0.64 T TTHTA
8 N 3 0 0.68 N NNNDY
9 F 4 0 0.99 F FFFFY
10 W 3 0 0.72 W WWIWR
11 M 2 0 0.53 M MMYIF
12 F 4 0 0.73 F FFFFA
13 G 3 0 0.64 G GGMGS
14 L 3 0 0.71 L LFLLS
15 F 4 0 0.99 F FFFFY
16 F 4 0 0.76 F FFFFS
17 F 5 0 1.00 F FFFFF
18 F 3 0 0.92 F FFFLL
19 Y 4 0 0.99 Y YYYYF
20 F 4 0 0.99 F FFYFF
21 F 4 0 0.89 F FFFFI
22 I 4 0 0.79 I IIIIS
23 M 4 0 0.76 M MMMMW
24 G 2 0 0.65 G GGSAS
25 A 3 0 0.64 A AAATL
............ ............ ............
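The seven fields F1 to F7 map naturally onto a small record type. The sketch below is an illustration under stated assumptions, not code from VHMPT: the names `ConservationRow` and `parse_rows` are hypothetical, fields are assumed to be separated by whitespace, and any line that does not have exactly seven fields with a numeric first field (such as the dotted continuation line) is skipped.

```python
from typing import NamedTuple


class ConservationRow(NamedTuple):
    """One row of the listing; field names are illustrative only."""
    position: int         # F1
    consensus: str        # F2
    consensus_count: int  # F3
    gap_count: int        # F4
    score: float          # F5
    reference: str        # F6
    column: str           # F7


def parse_rows(text):
    """Return a list of ConservationRow records from whitespace-separated text."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7 or not fields[0].isdigit():
            continue  # not a data row
        rows.append(ConservationRow(
            int(fields[0]), fields[1], int(fields[2]), int(fields[3]),
            float(fields[4]), fields[5], fields[6],
        ))
    return rows


if __name__ == "__main__":
    sample = "16 F 4 0 0.76 F FFFFS\n17 F 5 0 1.00 F FFFFF\n............\n"
    # List positions whose score marks them as strictly conserved.
    print([row.position for row in parse_rows(sample) if row.score == 1.0])
```

Once the rows are records, questions such as "which positions are strictly conserved" or "where does the reference residue differ from the consensus" become one-line filters over the list.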