So, I wrote myself a small script to read robots.txt files, because I couldn't get WWW::RobotRules to give me the full URLs.
Here is my code.
I'm trying to rebuild the URLs; in Google's case, I want to end up with the list quoted after the code.
Code:
#!C:\Perl\bin\perl.exe
use strict;
use warnings;
use LWP::Simple;

# Download the first URL of the list and return its content.
sub find {
    my @EE = @_;
    for my $URL (@EE) {
        my $B = get($URL);
        return $B;
    }
}

my @EE = ("http://www.google.com/robots.txt");
@EE = find(@EE);

# Print the raw robots.txt
print "@EE";
print "\n";
print "----------------------------------------\n";

# Strip the Allow rules; the Disallow lines are still relative paths here.
for my $T (@EE) {
    $T =~ s/Allow.*//g;
    print "$T";
}
Quote:
http://www.google.com/search
http://www.google.com/groups
http://www.google.com/images
http://www.google.com/catalogs
http://www.google.com/catalog_list
http://www.google.com/news
http://www.google.com/nwshp
http://www.google.com/?
http://www.google.com/addurl/image?
http://www.google.com/pagead/
http://www.google.com/relpage/
http://www.google.com/sorry/
http://www.google.com/imgres
http://www.google.com/keyword/
http://www.google.com/u/
http://www.google.com/univ/
http://www.google.com/cobrand
http://www.google.com/custom
http://www.google.com/advanced_group_search
http://www.google.com/advanced_search
http://www.google.com/googlesite
http://www.google.com/preferences
http://www.google.com/setprefs
http://www.google.com/swr
http://www.google.com/url
http://www.google.com/wml?
http://www.google.com/xhtml?
http://www.google.com/imode?
http://www.google.com/jsky?
http://www.google.com/pda?
http://www.google.com/sprint_xhtml
http://www.google.com/sprint_wml
http://www.google.com/pqa
http://www.google.com/palm
http://www.google.com/hws
http://www.google.com/bsd?
http://www.google.com/linux?
http://www.google.com/mac?
http://www.google.com/microsoft?
http://www.google.com/unclesam?
http://www.google.com/answers/search?q=
http://www.google.com/local?
http://www.google.com/local_url
http://www.google.com/froogle?
http://www.google.com/froogle_
http://www.google.com/print?
http://www.google.com/scholar?
http://www.google.com/complete
http://www.google.com/sponsoredlinks
http://www.google.com/videosearch?
http://www.google.com/videopreview?
http://www.google.com/videoprograminfo?
http://www.google.com/maps?
http://www.google.com/translate?
http://www.google.com/ie?
http://www.google.com/sms/demo?
I've tried various regexp-based solutions, but I'm really stuck here, so help!
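One possible way to get that list with a plain regex is to prefix each Disallow path with the site root. This is only a rough sketch and has not been tested against the live file. The helper name `disallowed_urls` and the sample input are made up for illustration; in the real script the robots.txt text would be the string returned by `get()` from LWP::Simple.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the Disallow rules of a robots.txt body
# into absolute URLs by prefixing each path with the site root.
sub disallowed_urls {
    my ($base, $robots_txt) = @_;
    $base =~ s{/+$}{};                 # drop trailing slash(es) from the root
    my @urls;
    for my $line (split /\r?\n/, $robots_txt) {
        $line =~ s/#.*//;              # strip robots.txt comments
        # Keep only "Disallow: /path" lines; an empty Disallow is skipped.
        next unless $line =~ /^\s*Disallow\s*:\s*(\S+)/i;
        push @urls, $base . $1;
    }
    return @urls;
}

# Made-up sample standing in for the body returned by get()
my $sample = join "\n",
    "User-agent: *",
    "Allow: /searchhistory/",
    "Disallow: /search",
    "Disallow: /groups",
    "Disallow: /?",
    "";

print "$_\n" for disallowed_urls("http://www.google.com", $sample);
```

This ignores which User-agent section a rule belongs to and treats paths as literal strings, so it only fits the simple case shown in the quote. If relative or unusual paths need proper resolution, `URI->new_abs($path, $base)` from the URI module may be a better fit than string concatenation.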