Hello everyone,
I am facing a problem that I cannot solve.
I have a file that looks like this:

>FBgn0034742 type=gene; loc=2R:complement(18487910..18493140); ID=FBgn0034742; name=CG4294; dbxref=FlyBase:FBan0004294,FlyBase:FBgn0034742,FlyBase_Annotation_IDs:CG4294,GB:AA202210,GB:AA263903,GB:AW941631,UniProt/TrEMBL:Q9W238,GB_protein:AAF46861,INTERPRO:IPR019607,EntrezGene:37579,InterologFinder:37579,BIOGRID:63195,DroID:FBgn0034742,DRSC:FBgn0034742,FLIGHT:FBgn0034742,FlyAtlas:CG4294-RA,FlyMine:FBgn0034742,GenomeRNAi:37579,modMine:FBgn0034742; derived_computed_cyto=58F1-58F1%3B Limits computationally determined from genome sequence between @P{lacW}l(2)k13211<up>k13211</up>@%26@P{EP}EP827@ and @P{PZ}l(2)rG270<up>rG270</up>@; gbunit=AE013599; MD5=de181d96921fe601ef2dbf64c2181d6a; length=5231; release=r5.44; species=Dmel; 
>FBgn0051637 type=gene; loc=2L:6498642..6526996; ID=FBgn0051637; name=CG31637; dbxref=FlyBase:FBgn0051637,FlyBase:FBan0031637,FlyBase_Annotation_IDs:CG31637,GB_protein:AAF52398,FlyBase:FBgn0031827,FlyBase:FBgn0031828,FlyBase:FBgn0062239,GB:AA817254,GB:AI061950,GB:AW943257,GB:AX093983,GB:AY058647,GB_protein:AAL13876,UniProt/TrEMBL:Q9VMC3,INTERPRO:IPR000863,EntrezGene:33914,InterologFinder:33914,BIOGRID:60070,DPIM:FBgn0051637,DroID:FBgn0051637,DRSC:FBgn0051637,FLIGHT:FBgn0051637,FlyAtlas:CG31637-RA,FlyMine:FBgn0051637,GenomeRNAi:33914,modMine:FBgn0051637; derived_computed_cyto=26D9-26E1%3B Limits computationally determined from genome sequence between @P{EP}Sec61α<up>EP2180</up>@ and @P{lacW}Ate1<up>k10809</up>@; gbunit=AE014134; MD5=557958edbe7c9a6a3fb874b35a21df39; length=28355; release=r5.44; species=Dmel; 
>FBgn0085370 type=gene; loc=2L:complement(18532728..18571648); ID=FBgn0085370; name=Pde11; dbxref=FlyBase_Annotation_IDs:CG10231,GB_protein:AAF53676,GB_protein:AAF53675,FlyBase_Annotation_IDs:CG34341,FlyBase:FBgn0085370,FlyBase_Annotation_IDs:CG15159,GB_protein:AAM52774,GB:CZ481948,FlyBase:FBan0010231,UniProt/Swiss-Prot:Q9VJ79,GB:BG635636,GB:CZ489023,GB:CZ489022,GB:CZ473163,GB:CZ478706,INTERPRO:IPR003607,INTERPRO:IPR002073,INTERPRO:IPR003018,GB:AY122262,GB:CZ481559,FlyBase:FBgn0032686,FlyBase:FBgn0032687,FlyBase:FBan0015159,EntrezGene:35107,INTERPRO:IPR023088,INTERPRO:IPR023174,GB_protein:ADV37087,GB_protein:ADV37086,InterologFinder:35107,BIOGRID:61102,DroID:FBgn0085370,DRSC:FBgn0085370,FLIGHT:FBgn0085370,FlyAtlas:CG15159-RA;CG10231-RA,FlyMine:FBgn0085370,GenomeRNAi:35107,modMine:FBgn0085370; derived_computed_cyto=36F6-36F6%3B Limits computationally determined from genome sequence between @P{lacW}Aac11<up>k06710</up>@ and @P{EP}CG10413<up>EP2164</up>@; gbunit=AE014134; MD5=0979ee8fe04ac9fcf62386a2916a9a26; length=38921; release=r5.44; species=Dmel;
So the information I want to retrieve is always in the dbxref "variable", right after "FlyBase_Annotation_IDs:". But as the example shows, this entry is not always at the same position within the variable, so I cannot use the piece of code I wrote below:

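# $inp1 is a read filehandle on the file shown above.
while ( my $ligne = <$inp1> ) {
	chomp $ligne;
	if ( $ligne =~ m{^>} ) {
		# Split the header into its ';'-separated fields, by position.
		my ($type, $loc, $ID, $name, $dbxref, $other, $gbunit, $MD5, $length, $release, $species) = split /;/, $ligne;
		# Problem: this assumes the annotation ID is always the third dbxref entry.
		my ($flybase, $flybase2, $CG, undef) = split /,/, $dbxref;
		$ID   =~ s{ *ID=}{};
		$loc  =~ s{ *loc=}{};
		$name =~ s{ *name=}{};
		# Strip the "FlyBase_Annotation_IDs:" prefix to keep only the CG number.
		$CG   =~ s{ *FlyBase_Annotation_IDs:}{};
		my ($chr, $coor) = split /:/, $loc;
		print ">$ID;$name;$CG;chr$chr\n";
	}
}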
Could someone help me, please?
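
One possible direction, as a rough sketch only: instead of relying on the position of the entry inside dbxref, match the "FlyBase_Annotation_IDs:" label itself with a regular expression. The helper names annotation_ids and format_header are hypothetical, the sample headers are shortened versions of the lines above, and joining several IDs with a comma is an assumption about the desired output.

```perl
use strict;
use warnings;

# Hypothetical helper: return every FlyBase_Annotation_IDs value in the
# dbxref field of one header line, whatever its position in the list.
sub annotation_ids {
    my ($header) = @_;
    # Fields end with "; ", so stop at the first ';' followed by whitespace
    # (a bare ';' can occur inside dbxref, e.g. "CG15159-RA;CG10231-RA").
    my ($dbxref) = $header =~ m{\bdbxref=(.*?);(?:\s|\z)}
        or return;
    # In list context, /g returns all captures, one per matching entry.
    return $dbxref =~ m{(?:^|,)FlyBase_Annotation_IDs:([^,;]+)}g;
}

# Hypothetical helper: build the ">ID;name;CG;chrN" line for one header.
# Assumption: when several annotation IDs exist, they are joined with ','.
sub format_header {
    my ($header) = @_;
    my ($id)   = $header =~ m{\bID=([^;]+)};
    my ($name) = $header =~ m{\bname=([^;]+)};
    my ($chr)  = $header =~ m{\bloc=([^:;]+):};
    my @cg     = annotation_ids($header);
    return unless defined $id && defined $name && defined $chr && @cg;
    return ">$id;$name;" . join( ',', @cg ) . ";chr$chr";
}

# Shortened sample headers (dbxref lists trimmed for readability).
my @sample = (
    '>FBgn0034742 type=gene; loc=2R:complement(18487910..18493140); ID=FBgn0034742; name=CG4294; dbxref=FlyBase:FBan0004294,FlyBase:FBgn0034742,FlyBase_Annotation_IDs:CG4294,GB:AA202210; gbunit=AE013599; ',
    '>FBgn0085370 type=gene; loc=2L:complement(18532728..18571648); ID=FBgn0085370; name=Pde11; dbxref=FlyBase_Annotation_IDs:CG10231,GB_protein:AAF53676,FlyBase_Annotation_IDs:CG34341,FlyBase:FBgn0085370,FlyBase_Annotation_IDs:CG15159,FlyAtlas:CG15159-RA;CG10231-RA,modMine:FBgn0085370; gbunit=AE014134; ',
);

print format_header($_), "\n" for @sample;
# prints:
# >FBgn0034742;CG4294;CG4294;chr2R
# >FBgn0085370;Pde11;CG10231,CG34341,CG15159;chr2L
```

In the reading loop, this would amount to calling format_header($ligne) inside the `if ( $ligne =~ m{^>} )` block and printing the result when it is defined. Extracting each field by its label also avoids the positional split on ';', which can be thrown off by a ';' appearing inside dbxref.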