Hello,

I wrote a crawler that works with many sites, including HTTPS ones, but I have a problem with one site in particular: https://lingeriematterhorn.fr. There seems to be a problem with its SSL configuration (which I have no access to), and it returns errors:

javax.net.ssl.SSLException: Received fatal alert: internal_error
	at sun.security.ssl.Alerts.getSSLException(Unknown Source)
	at sun.security.ssl.Alerts.getSSLException(Unknown Source)
	at sun.security.ssl.SSLSocketImpl.recvAlert(Unknown Source)
	at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
	at sun.security.ssl.SSLSocketImpl.performInitialHandshake(Unknown Source)
	at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
	at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
	at sun.net.www.protocol.https.HttpsClient.afterConnect(Unknown Source)
	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source)
	at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(Unknown Source)
	at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:563)
	at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:587)
	at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:540)
	at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:227)
	at main.SimpleCrawler.getContent(SimpleCrawler.java:482)
Yet I accept all certificates (security is not a concern for me here):
// Accept all SSL certificates
			// Create a trust manager that does not validate certificate chains
			TrustManager[] trustAllCerts = new TrustManager[] { new X509TrustManager() {
			    public java.security.cert.X509Certificate[] getAcceptedIssuers() {
			        // The X509TrustManager contract expects a non-null array
			        return new java.security.cert.X509Certificate[0];
			    }
 
			    public void checkClientTrusted(java.security.cert.X509Certificate[] certs, String authType) {
			    }
 
			    public void checkServerTrusted(java.security.cert.X509Certificate[] certs, String authType) {
			    }
 
			} };
			// Install the all-trusting trust manager
			try {
			    SSLContext sc = SSLContext.getInstance("SSL");
			    sc.init(null, trustAllCerts, new java.security.SecureRandom());
			    HttpsURLConnection.setDefaultSSLSocketFactory(sc.getSocketFactory());
			} catch (Exception e) {
			    // Do not silently hide a failure to install the trust manager
			    e.printStackTrace();
			}
			// Install a HostnameVerifier that accepts every host name
			HostnameVerifier hv = new HostnameVerifier() {
			    public boolean verify(String urlHostName, SSLSession session) {
			        // System.out.println("Warning: URL Host: "+urlHostName+"
			        // vs. "+session.getPeerHost());
			        return true;
			    }
			};
			HttpsURLConnection.setDefaultHostnameVerifier(hv);
I tried adding the JVM options -Djsse.enableSNIExtension=false, -Dhttps.protocols=TLSv1 -Djdk.tls.client.protocols=TLSv1, and -Dcom.sun.net.ssl.enableECC=false, but the error persists.
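
For reference, the same switches can also be set from code instead of on the command line. This is only a minimal sketch: the class name `TlsOptions` and its `apply` method are hypothetical, and it assumes the call happens before the first HTTPS connection is opened, since JSSE reads most of these properties when its classes are first loaded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: applies the JVM options listed above programmatically.
final class TlsOptions {

    // Sets each option as a system property and returns what was applied.
    static Map<String, String> apply() {
        Map<String, String> opts = new LinkedHashMap<>();
        opts.put("jsse.enableSNIExtension", "false");
        opts.put("https.protocols", "TLSv1");
        opts.put("jdk.tls.client.protocols", "TLSv1");
        opts.put("com.sun.net.ssl.enableECC", "false");
        opts.forEach(System::setProperty);
        return opts;
    }

    public static void main(String[] args) {
        // Prints the equivalent command-line flags.
        apply().forEach((k, v) -> System.out.println("-D" + k + "=" + v));
    }
}
```

Setting them this way gives the same result as the command-line flags, so it is only a convenience for reproducing the tests above, not a fix.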

For reference, when I set -Djavax.net.debug=all -Dcom.sun.net.ssl.enableECC=false, I get this:
Allow unsafe renegotiation: false
Allow legacy hello messages: true
Is initial handshake: true
Is secure renegotiation: false
Thread-4, setSoTimeout(300000) called
Ignoring unsupported cipher suite: TLS_RSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_DHE_RSA_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_DHE_DSS_WITH_AES_128_CBC_SHA256 for TLSv1
Ignoring unsupported cipher suite: TLS_RSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_DHE_RSA_WITH_AES_128_CBC_SHA256 for TLSv1.1
Ignoring unsupported cipher suite: TLS_DHE_DSS_WITH_AES_128_CBC_SHA256 for TLSv1.1
%% No cached client session
*** ClientHello, TLSv1.2
RandomCookie:  GMT: 1469901999 bytes = { 187, 123, 81, 231, 56, 0, 220, 78, 90, 63, 36, 16, 9, 37, 217, 216, 227, 67, 19, 24, 121, 79, 132, 26, 18, 197, 70, 43 }
Session ID:  {}
Cipher Suites: [TLS_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_RSA_WITH_AES_128_CBC_SHA256, TLS_DHE_DSS_WITH_AES_128_CBC_SHA256, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_DSS_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_RSA_WITH_AES_128_GCM_SHA256, TLS_DHE_DSS_WITH_AES_128_GCM_SHA256, SSL_RSA_WITH_3DES_EDE_CBC_SHA, SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA, SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA, TLS_EMPTY_RENEGOTIATION_INFO_SCSV]
Compression Methods:  { 0 }
Extension signature_algorithms, signature_algorithms: SHA512withECDSA, SHA512withRSA, SHA384withECDSA, SHA384withRSA, SHA256withECDSA, SHA256withRSA, SHA256withDSA, SHA1withECDSA, SHA1withRSA, SHA1withDSA
***
[write] MD5 and SHA1 hashes:  len = 97
0000: 01 00 00 5D 03 03 58 9D   ED AF BB 7B 51 E7 38 00  ...]..X.....Q.8.
0010: DC 4E 5A 3F 24 10 09 25   D9 D8 E3 43 13 18 79 4F  .NZ?$..%...C..yO
0020: 84 1A 12 C5 46 2B 00 00   1A 00 3C 00 67 00 40 00  ....F+....<.g.@.
0030: 2F 00 33 00 32 00 9C 00   9E 00 A2 00 0A 00 16 00  /.3.2...........
0040: 13 00 FF 01 00 00 1A 00   0D 00 16 00 14 06 03 06  ................
0050: 01 05 03 05 01 04 03 04   01 04 02 02 03 02 01 02  ................
0060: 02                                                 .
Thread-4, WRITE: TLSv1.2 Handshake, length = 97
[Raw write]: length = 102
0000: 16 03 03 00 61 01 00 00   5D 03 03 58 9D ED AF BB  ....a...]..X....
0010: 7B 51 E7 38 00 DC 4E 5A   3F 24 10 09 25 D9 D8 E3  .Q.8..NZ?$..%...
0020: 43 13 18 79 4F 84 1A 12   C5 46 2B 00 00 1A 00 3C  C..yO....F+....<
0030: 00 67 00 40 00 2F 00 33   00 32 00 9C 00 9E 00 A2  .g.@./.3.2......
0040: 00 0A 00 16 00 13 00 FF   01 00 00 1A 00 0D 00 16  ................
0050: 00 14 06 03 06 01 05 03   05 01 04 03 04 01 04 02  ................
0060: 02 03 02 01 02 02                                  ......
[Raw read]: length = 5
0000: 15 03 03 00 02                                     .....
[Raw read]: length = 2
0000: 02 50                                              .P
Thread-4, READ: TLSv1.2 Alert, length = 2
Thread-4, RECV TLSv1.2 ALERT:  fatal, internal_error
Thread-4, called closeSocket()
Thread-4, handling exception: javax.net.ssl.SSLException: Received fatal alert: internal_error
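
To make the hex dumps above easier to read, here is a small sketch that decodes the alert record and lists the extension types present in the ClientHello. The class `TlsDumpReader` and its methods are hypothetical helpers written for this post, not part of JSSE, and the alert name table only covers the one code seen here (80, `internal_error`).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for reading the hex dumps in the debug output above.
final class TlsDumpReader {

    // Parses "16 03 03 ..." style hex text into bytes.
    static byte[] hex(String s) {
        String[] parts = s.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    // Describes a TLS alert record: 5-byte record header, then level and description.
    static String describeAlert(byte[] rec) {
        if (rec.length < 7 || (rec[0] & 0xFF) != 0x15) {
            throw new IllegalArgumentException("not a TLS alert record");
        }
        String level = (rec[5] & 0xFF) == 2 ? "fatal" : "warning";
        int desc = rec[6] & 0xFF;
        String name = desc == 80 ? "internal_error" : "alert(" + desc + ")";
        return level + ", " + name;
    }

    // Lists the extension type codes carried by a ClientHello handshake message.
    static List<Integer> extensionTypes(byte[] hello) {
        int p = 4 + 2 + 32;                   // handshake header, version, random
        p += 1 + (hello[p] & 0xFF);           // session id
        p += 2 + u16(hello, p);               // cipher suites
        p += 1 + (hello[p] & 0xFF);           // compression methods
        List<Integer> types = new ArrayList<>();
        if (p + 2 > hello.length) {
            return types;                     // no extensions block
        }
        int end = p + 2 + u16(hello, p);
        p += 2;
        while (p + 4 <= end) {
            types.add(u16(hello, p));         // extension type
            p += 4 + u16(hello, p + 2);       // skip type, length and body
        }
        return types;
    }

    // Reads a big-endian unsigned 16-bit value.
    private static int u16(byte[] b, int off) {
        return ((b[off] & 0xFF) << 8) | (b[off + 1] & 0xFF);
    }

    public static void main(String[] args) {
        // The two [Raw read] chunks from the log, joined into one record.
        System.out.println(describeAlert(hex("15 03 03 00 02 02 50")));
    }
}
```

Fed the two `[Raw read]` chunks, `describeAlert` gives `fatal, internal_error`, matching the log. Fed the 97-byte `[write]` dump, `extensionTypes` returns only type 13 (signature_algorithms), so this ClientHello carries no server_name (type 0) extension.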

Here is my code so that you can reproduce the problem, although I don't think the cause lies there:
BufferedReader reader = null;
  HttpURLConnection conn = null;
  String text = new String();
  StringBuilder sb = new StringBuilder();
 
      //CookieHandler.setDefault(new CookieManager(null, CookiePolicy.ACCEPT_ALL));
 
      URL url2 = new URL("https://lingeriematterhorn.fr/chemise_de_nuit_no_74757_prod_id-74757.htm");
      conn = (HttpURLConnection) url2.openConnection();
 
 
      conn.setRequestProperty( "Host", url2.getHost() );
      conn.setRequestProperty( "Accept", "text/javascript, text/html, application/xml, text/xml, */*" );//text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
      conn.setRequestProperty( "Accept-Language", "fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3" );
      conn.setRequestProperty( "Connection", "keep-alive" );
      //conn.setRequestProperty( "X-Requested-With", "XMLHttpRequest" );
      //conn.setRequestProperty( "X-Prototype-Version", "1.7" );
      conn.setInstanceFollowRedirects(false);
      //conn.setDoOutput(true); // POST mode
 
 
 
      // read the response
      reader = new BufferedReader(new InputStreamReader(conn.getInputStream(), Charset.defaultCharset().newDecoder() )); //, StandardCharsets.UTF_8
 
	  char[] cbuf = new char[8192];
	  int len;
	  while ( (len = reader.read(cbuf)) >= 0 ) {
			sb.append(cbuf, 0, len);
		}
	  text = sb.toString();