#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;
#package MyParser;
#use base qw(HTML::Parser);
#my $p = new HTML::Parser;
# $p->parse_file("backup.html");
# api_version 3 selects the handler-based interface used below
my $parser = HTML::Parser->new( api_version => 3 );
# define the event handlers
$parser->handler( text => \&text, "dtext" );
$parser->handler( start => \&start, "tagname,attr" );
$parser->handler( end => \&end, "tagname" );
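my $count_td = 0;    # nesting depth of open <td> elements
my $count_a  = 0;    # nesting depth of open <a href="..."> elements
my @data;            # text collected from the cells

sub start {
    # Only two values arrive here, matching the "tagname,attr" argspec.
    my ($tag, $attr) = @_;
    if ($tag eq 'td') {
        $count_td++;    # count only <td>, not every start tag
    }
    elsif ($tag eq 'a'
        and defined $attr->{href}
        and $attr->{href} =~ /[a-zA-Z0-9_]/) {    # regex match, not string 'eq'
        $count_a++;
    }
}

sub end {
    my ($tag) = @_;
    # Decrement only the counter that belongs to the closing tag,
    # and never below zero.
    if ($tag eq 'td' and $count_td) {
        $count_td--;
    }
    elsif ($tag eq 'a' and $count_a) {
        $count_a--;
    }
}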
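sub text {
    my ($text) = @_;
    # Keep the text only while inside a <td> cell or an <a> link.
    if ($count_td or $count_a) {
        push @data, $text;
        print " $text \n";
    }
    # Do not print or return @data here: this handler runs once per text
    # chunk, so the whole array would be dumped again on every call.
    return;
}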
# @data is still empty at this point; it is only filled once parse() runs.
# package main;
my $html = <<EOHTML;
<html>
<tr>
<td>1</td>
<td>
<a href="#objdef-id3206X4432">mx1.messagelabs.com</a><br>
</td>
<td>
<a href="#objdef-id1856X4432">MX_servers</a><br>
</td>
<td>smtp<br>
</td>
<td>Accept</td>
<td><br>
</td>
</tr>
<tr>
<td>2</td>
<td>
<a href="#objdef-id1856X4432">MX_servers</a><br>
</td>
<td>Any<br>
</td>
<td>smtp<br>
</td>
<td>Accept</td>
<td><br>
</td>
</tr>
<tr>
<td>3</td>
<td>
<a href="#objdef-id1856X4432">MX_servers</a><br>
</td>
<td>
<a href="#objdef-id1854X4432">Domain_controllers</a><br>
</td>
<td>
<a href="#objdef-id4885X4432">Tcp 139</a><br>
</td>
<td>Accept</td>
<td><br>
</td>
</tr>
</html>
EOHTML
#my $parser = MyParser->new;
$parser->parse( $html );
$parser->eof;    # signal end of input so any buffered text is flushed