Hello,

I have a tab-separated (.tab) file. I would like to search it for keywords and write the matching result to a file. The file is poorly organized and full of spaces.

Here is the file in question:
Entry	Entry name	Status	Protein names	Gene names	Organism	Length	Protein existence	Absorption	Enzyme regulation	pH dependence	DNA binding	Site	Nucleotide binding	Fragment	Gene encoded by	Alternative products (isoforms)	Mass spectrometry	Polymorphism	RNA editing	Sequence caution	Mass	Sequence	Proteomes	Pathway	Active site	Binding site	Catalytic activity	EC number	Cofactor	Function [CC]	Kinetics	Redox potential	Temperature dependence	Calcium binding	Metal binding
P04637	P53_HUMAN	reviewed	Cellular tumor antigen p53 (Antigen NY-CO-13) (Phosphoprotein p53) (Tumor suppressor p53)	TP53 P53	Homo sapiens (Human)	393	Evidence at protein level				DNA_BIND 102 292	SITE 120 120 Interaction with DNA.				ALTERNATIVE PRODUCTS:  Event=Alternative promoter usage, Alternative splicing; Named isoforms=9;  Name=1; Synonyms=p53, p53alpha; IsoId=P04637-1; Sequence=Displayed; Name=2; Synonyms=I9RET, p53beta; IsoId=P04637-2; Sequence=VSP_006535, VSP_006536; Note=Expressed in quiescent lymphocytes. Seems to be non-functional. May be produced at very low levels due to a premature stop codon in the mRNA, leading to nonsense-mediated mRNA decay.; Name=3; Synonyms=p53gamma; IsoId=P04637-3; Sequence=VSP_040560, VSP_040561; Note=Expressed in quiescent lymphocytes. Seems to be non-functional. May be produced at very low levels due to a premature stop codon in the mRNA, leading to nonsense-mediated mRNA decay.; Name=4; Synonyms=Del40-p53, Del40-p53alpha, p47; IsoId=P04637-4; Sequence=VSP_040832; Name=5; Synonyms=Del40-p53beta; IsoId=P04637-5; Sequence=VSP_040832, VSP_006535, VSP_006536; Name=6; Synonyms=Del40-p53gamma; IsoId=P04637-6; Sequence=VSP_040832, VSP_040560, VSP_040561; Name=7; Synonyms=Del133-p53, Del133-p53alpha; IsoId=P04637-7; Sequence=VSP_040833; Note=Produced by alternative promoter usage.; Name=8; Synonyms=Del133-p53beta; IsoId=P04637-8; Sequence=VSP_040833, VSP_006535, VSP_006536; Note=Produced by alternative promoter usage and alternative splicing.; Name=9; Synonyms=Del133-p53gamma; IsoId=P04637-9; Sequence=VSP_040833, VSP_040560, VSP_040561; Note=Produced by alternative promoter usage and alternative splicing.; 					43,653	
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGPDEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAKSVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELPPGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPGGSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD	UP000005640: Chromosome 17						COFACTOR: Name=Zn(2+); Xref=ChEBI:CHEBI:29105; ;  Note=Binds 1 zinc ion per subunit.;	FUNCTION: Acts as a tumor suppressor in many tumor types; induces growth arrest or apoptosis depending on the physiological circumstances and cell type. Involved in cell cycle regulation as a trans-activator that acts to negatively regulate cell division by controlling a set of genes required for this process. One of the activated genes is an inhibitor of cyclin-dependent kinases. Apoptosis induction seems to be mediated either by stimulation of BAX and FAS antigen expression, or by repression of Bcl-2 expression. In cooperation with mitochondrial PPIF is involved in activating oxidative stress-induced necrosis; the function is largely independent of transcription. Induces the transcription of long intergenic non-coding RNA p21 (lincRNA-p21) and lincRNA-Mkln1. LincRNA-p21 participates in TP53-dependent transcriptional repression leading to apoptosis and seem to have to effect on cell-cycle regulation. Implicated in Notch signaling cross-over. Prevents CDK7 kinase activity when associated to CAK complex in response to DNA damage, thus stopping cell cycle progression. Isoform 2 enhances the transactivation activity of isoform 1 from some but not all TP53-inducible promoters. Isoform 4 suppresses transactivation activity and impairs growth suppression mediated by isoform 1. Isoform 7 inhibits isoform 1-mediated apoptosis. Regulates the circadian clock by repressing CLOCK-ARNTL/BMAL1-mediated transcriptional activation of PER2 (PubMed:24051492). 
{ECO:0000269|PubMed:11025664, ECO:0000269|PubMed:12810724, ECO:0000269|PubMed:15186775, ECO:0000269|PubMed:15340061, ECO:0000269|PubMed:17317671, ECO:0000269|PubMed:17349958, ECO:0000269|PubMed:19556538, ECO:0000269|PubMed:20673990, ECO:0000269|PubMed:20959462, ECO:0000269|PubMed:22726440, ECO:0000269|PubMed:24051492, ECO:0000269|PubMed:9840937}.					METAL 176 176 Zinc.; METAL 179 179 Zinc.; METAL 238 238 Zinc.; METAL 242 242 Zinc.
And here is the bit of code I have just started, but it prints the whole file!

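chaine = "Cellular"  # keyword to search for

# "with" closes the file automatically, even if an error occurs
with open("uniprot-id%3AP04637.tab", "r") as f:
    for line in f:
        # prints the entire line (every tab-separated field) when the keyword appears in it
        if chaine in line:
            print(line)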
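
A possible direction, sketched under assumptions rather than as a confirmed solution: each line of the file holds many tab-separated fields, so printing a matching line prints nearly everything. Splitting the line on tabs and keeping only the fields that contain the keyword would narrow the output. The function name `fields_matching` and the shortened `sample` line below are hypothetical, made up for illustration.

```python
def fields_matching(line, keyword):
    """Return the tab-separated fields of `line` that contain `keyword`."""
    # Split on tabs and strip the stray spaces around each field
    fields = [field.strip() for field in line.split("\t")]
    return [field for field in fields if keyword in field]


# Hypothetical shortened record, using the first few fields of the file above
sample = "P04637\tP53_HUMAN\treviewed\tCellular tumor antigen p53\tTP53 P53"
print(fields_matching(sample, "Cellular"))
```

The returned list could then be written to an output file instead of printed.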