HTTP Authentication
 
Many web sites restrict access to documents by using ``HTTP Authentication''. This isn't just any form of ``enter your password'' restriction, but is a specific mechanism where the HTTP server sends the browser an HTTP code that says ``That document is part of a protected 'realm', and you can access it only if you re-request it and add some special authorization headers to your request''.
 
For example, the Unicode.org admins stop email-harvesting bots from harvesting the contents of their mailing list archives, by protecting them with HTTP Authentication, and then publicly stating the username and password (at http://www.unicode.org/mail-arch/) -- namely username ``unicode-ml'' and password ``unicode''.
 
Consider this URL, which is part of the protected area of that web site:
 
  http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html
 
If you access that with a browser, you'll get a prompt like ``Enter username and password for 'Unicode-MailList-Archives' at server 'www.unicode.org'''.
 
In LWP, if you just request that URL, like this:
 
  use LWP;
  my $browser = LWP::UserAgent->new;
 
  my $url =
   'http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html';
  my $response = $browser->get($url);
 
  die "Error: ", $response->header('WWW-Authenticate') || 'Error accessing',
    #  (the 'WWW-Authenticate' header carries the scheme and the realm-name)
    "\n ", $response->status_line, "\n at $url\n Aborting"
   unless $response->is_success;
 
Then you'll get this error:
 
  Error: Basic realm="Unicode-MailList-Archives"
   401 Authorization Required
   at http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html
   Aborting at auth1.pl line 9.  [or wherever]
 
...because the $browser doesn't know the username and password for that realm (``Unicode-MailList-Archives'') at that host (``www.unicode.org''). The simplest way to tell the browser is to call the credentials method, which registers a username and password that it can try for that realm at that host. The syntax is:
 
  $browser->credentials(
    'servername:portnumber',
    'realm-name',
    'username' => 'password'
  );
 
In most cases, the port number is 80, the default TCP/IP port for HTTP; and you usually call the credentials method before you make any requests. For example:
 
  $browser->credentials(
    'reports.mybazouki.com:80',
    'web_server_usage_reports',
    'plinky' => 'banjo123'
  );
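
The first two arguments must match what the server reports, so it can help to derive them rather than type them by hand. The following is only a minimal sketch: the helper names netloc_for and realm_from_challenge are hypothetical (they are not part of LWP), and it assumes the URI module that LWP itself depends on is installed. It builds the ``servername:portnumber'' string from a URL and pulls the realm name out of a WWW-Authenticate value such as the one in the error message above.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: turn a URL into the 'servername:portnumber' string.
# URI's host_port method fills in the scheme's default port when the URL omits one.
sub netloc_for {
    my ($url) = @_;
    return URI->new($url)->host_port;
}

# Hypothetical helper: extract the realm name from a WWW-Authenticate value.
# Returns undef if no quoted realm parameter is present.
sub realm_from_challenge {
    my ($challenge) = @_;
    return $challenge =~ /\brealm="([^"]*)"/i ? $1 : undef;
}

my $url =
  'http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html';
my $netloc = netloc_for($url);
my $realm  = realm_from_challenge('Basic realm="Unicode-MailList-Archives"');

print "netloc: $netloc\n";
print "realm:  $realm\n";
```

The two values printed here are what would go into the first and second arguments of the credentials call.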
 
So if we add the following to the program above, right after the ``$browser = LWP::UserAgent->new;'' line...
 
  $browser->credentials(  # add this to our $browser's "key ring"
    'www.unicode.org:80',
    'Unicode-MailList-Archives',
    'unicode-ml' => 'unicode'
  );
 
...then when we run it, the request succeeds, instead of causing the die to be called.
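
To make the ``special authorization headers'' less mysterious: for the Basic scheme (the one named in the WWW-Authenticate value above), the header value is the word Basic followed by the Base64 encoding of the username and password joined by a colon. The sketch below builds that value by hand, purely for illustration. The helper name basic_auth_value is hypothetical, and in normal use LWP constructs this header for you once credentials has been called.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: build the Authorization header value for the
# Basic scheme, i.e. "Basic " . base64("username:password").
sub basic_auth_value {
    my ($user, $pass) = @_;
    # The second argument '' suppresses the trailing newline that
    # encode_base64 appends by default.
    return 'Basic ' . encode_base64("$user:$pass", '');
}

my $value = basic_auth_value('unicode-ml', 'unicode');
print "Authorization: $value\n";
```

Because Base64 is an encoding and not encryption, anyone who sees the header can decode the username and password.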