Use Python 2.7 (please include a screenshot of the program with its output).
The task is: take in a list of protein sequences as input, find the transmembrane domains in each sequence based on the given nonpolar residues, and return a list of results for each sequence in the list.
1. This code should call two other functions that you write. The first, regionProteinFind, takes in a protein sequence and should return a list of 10-amino-acid windows; if the sequence is shorter than 10 amino acids, it should just return that sequence. (The first window covers amino acids 1-10, the next covers amino acids 2-11, and so on.) This is done for each sequence in the list.
Test code:
protein='MKLVVRPWAGCWWSTLGPRGSLSPLGICPLLMLLWATLR'
regionProteinFind returns:
['MKLVVRPWAG', 'KLVVRPWAGC', 'LVVRPWAGCW', 'VVRPWAGCWW', 'VRPWAGCWWS', 'RPWAGCWWST', 'PWAGCWWSTL', 'WAGCWWSTLG', 'AGCWWSTLGP', 'GCWWSTLGPR',
'CWWSTLGPRG', 'WWSTLGPRGS', 'WSTLGPRGSL', 'STLGPRGSLS', 'TLGPRGSLSP', 'LGPRGSLSPL', 'GPRGSLSPLG', 'PRGSLSPLGI', 'RGSLSPLGIC', 'GSLSPLGICP',
'SLSPLGICPL', 'LSPLGICPLL', 'SPLGICPLLM', 'PLGICPLLML', 'LGICPLLMLL', 'GICPLLMLLW', 'ICPLLMLLWA', 'CPLLMLLWAT', 'PLLMLLWATL', 'LLMLLWATLR']
Second test code:
protein='MP'
regionProteinFind should return: ['MP']
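The windowing described in step 1 could be sketched roughly as follows. This is only one possible reading of the requirement, not a definitive solution: the windowSize parameter is an assumption added for illustration (the assignment fixes the window at 10), and the function returns all windows in one call rather than one window per call.

```python
def regionProteinFind(protein, windowSize=10):
    """Return every windowSize-long sliding window of protein, stepping by one residue."""
    # A sequence shorter than the window is returned whole, as a one-item list.
    if len(protein) < windowSize:
        return [protein]
    # Start positions 0 .. len - windowSize give residues 1-10, 2-11, and so on.
    return [protein[i:i + windowSize]
            for i in range(len(protein) - windowSize + 1)]


windows = regionProteinFind('MKLVVRPWAGCWWSTLGPRGSLSPLGICPLLMLLWATLR')
print(windows[0])    # MKLVVRPWAG
print(windows[-1])   # LLMLLWATLR
print(len(windows))  # 30
print(regionProteinFind('MP'))  # ['MP']
```

A 39-residue sequence yields 39 - 10 + 1 = 30 windows, which matches the length of the expected list above.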
2. A second function called testForTM, which should calculate and return the decimal fraction of residues in a ten-amino-acid window that are nonpolar. It is applied to every window of each sequence in the list. The nonpolar residues are A, V, L, I, P, M, F, W. My code for this is:
def testForTM(AAWindow):
    # Count the nonpolar residues in the window.
    totalNP = 0
    nonPolarList = ['A', 'V', 'L', 'I', 'P', 'M', 'F', 'W']
    for aa in AAWindow:
        if aa in nonPolarList:
            totalNP += 1
    # Divide by the window's actual length so it also works for sequences shorter than 10, like 'MP'.
    return totalNP / float(len(AAWindow))
3. The last function, tmSCANNER (called TMFinder in the test code), should call regionProteinFind and testForTM. It takes the list of protein sequences as input and, for each sequence, generates a list of numbers giving the fraction of nonpolar residues in each 10-amino-acid window. The window slides along the sequence one amino acid at a time until it reaches the last window, for a protein sequence of any length. The code should output what is displayed below.
#Test code for TMFinder
listOfProtein=['MKLVVRPWAGCWWSTLGPRGSLSPLGICPLLMLLWATLR', 'MARKCSVPLVMAWLTWTTSRAPLPH', 'MPWPTSITXXXXXXSWSPEWLSSGLRSILGWEQPRVSHKGHSHEWHRRP']
tmValuesList=TMFinder(listOfProtein)
print 'The list of TM values are:', tmValuesList
As a result, it should print out this list:
["protein 1:'MKLVVRPWAGCWWSTLGPRGSLSPLGICPLLMLLWATLR'",
'TMValue:[0.7, 0.6, 0.7, 0.7, 0.6, 0.5, 0.6, 0.5, 0.5, 0.4, 0.4, 0.4, 0.4, 0.3, 0.4, 0.5, 0.4, 0.5, 0.4, 0.5, 0.6, 0.7, 0.7, 0.8, 0.8, 0.8, 0.9, 0.8, 0.9, 0.8]',
"protein2:'MARKCSVPLVMAWLTWTTSRAPLPH'",
'TMValue:[0.6, 0.6, 0.6, 0.7, 0.8, 0.8, 0.9, 0.8, 0.7, 0.6, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]',
"protein3:'MPWPTSITXXXXXXSWSPEWLSSGLRSILGWEQPRVSHKGHSHEWHRRP'",
'TMValue:[0.5, 0.4, 0.3, 0.2, 0.1, 0.1, 0.2, 0.1, 0.2, 0.2, 0.3, 0.4, 0.4, 0.4, 0.4, 0.5, 0.4, 0.4, 0.4, 0.5, 0.4, 0.4, 0.4, 0.4, 0.5, 0.4, 0.5, 0.5, 0.4, 0.3, 0.3, 0.2, 0.2, 0.2, 0.1, 0.2, 0.1, 0.1, 0.1]']
This is time sensitive. Thank you for the help!