Hello,

I am building an API in Java that extracts specific data from a PDF document uploaded by the user.

From what I have read online, this is difficult to do.

I have the following text, whose structure is fixed:
2.1 Create Separate Partition for /tmp (Scored)
Profile Applicability:
Level 1
Description:
The /tmp directory is a world-writable directory used for temporary storage by all users and some applications.
Rationale:
Since the /tmp directory is intended to be world-writable, there is a risk of resource exhaustion if it is not bound to a separate partition. In addition, making /tmp its own file system allows an administrator to set the noexec option on the mount.
 
Remediation:
For new installations, during installation create a custom partition setup and specify a separate partition for /tmp.
References:
1. AJ Lewis, "LVM HOWTO",
2.2 Set nodev option for /tmp Partition (Scored)
Profile Applicability:
 Level 1
Description:
The nodev mount option specifies that the filesystem cannot contain special devices.
Rationale:
Since the /tmp filesystem is not intended to support devices, set this option to ensure that users cannot attempt to create block or character special devices in /tmp.
Remediation:
Edit the /etc/fstab file and add nodev to the fourth field (mounting options). See the fstab(5) manual page for more information.
2.3 Add nodev Option to Removable Media Partitions (Not Scored)
Profile Applicability:
Level 1
Description:
Set nodev on removable media to prevent character and block special devices that are present on the removable media from being treated as device files.
Rationale:
Removable media containing character and block special devices could be used to circumvent security controls by allowing non-root users to access sensitive device files such as /dev/kmem or the raw disk partitions.
Remediation:
Edit the /etc/fstab file and add "nodev" to the fourth field (mounting options). 
2.4 Add nosuid Option to /run/shm Partition (Scored)
Profile Applicability:
 Level 1
Description:
The nosuid mount option specifies that the /run/shm (temporary filesystem stored in memory) will not execute setuid and setgid on executable programs as such, 
Rationale:
Setting this option on a file system prevents users from introducing privileged programs onto the system and allowing non-root users to execute them.
Remediation:
Edit the /etc/fstab file and add nosuid to the fourth field (mounting options).
So I am trying to retrieve the content of every section whose title contains "(Scored)", while skipping the "(Not Scored)" ones. For example, from the text above I expect this:
2.1 Create Separate Partition for /tmp (Scored)
Profile Applicability:
Level 1
Description:
The /tmp directory is a world-writable directory used for temporary storage by all users and some applications.
Rationale:
Since the /tmp directory is intended to be world-writable, there is a risk of resource exhaustion if it is not bound to a separate partition. In addition, making /tmp its own file system allows an administrator to set the noexec option on the mount.

Remediation:
For new installations, during installation create a custom partition setup and specify a separate partition for /tmp.
References:
1. AJ Lewis, "LVM HOWTO",
2.2 Set nodev option for /tmp Partition (Scored)
Profile Applicability:
 Level 1
Description:
The nodev mount option specifies that the filesystem cannot contain special devices.
Rationale:
Since the /tmp filesystem is not intended to support devices, set this option to ensure that users cannot attempt to create block or character special devices in /tmp.
Remediation:
Edit the /etc/fstab file and add nodev to the fourth field (mounting options). See the fstab(5) manual page for more information.
2.4 Add nosuid Option to /run/shm Partition (Scored)
Profile Applicability:
 Level 1
Description:
The nosuid mount option specifies that the /run/shm (temporary filesystem stored in memory) will not execute setuid and setgid on executable programs as such, 
Rationale:
Setting this option on a file system prevents users from introducing privileged programs onto the system and allowing non-root users to execute them.
Remediation:
Edit the /etc/fstab file and add nosuid to the fourth field (mounting options).

The idea is to go through each line and check whether it contains "(Scored)", then use the positions of the matches to cut the text with substring.

Here is my code:
 
public Boolean ExtractPDF() {
 
 String text = "2.1 Create Separate Partition for /tmp (Scored)
Profile Applicability:
Level 1
Description:
The /tmp directory is a world-writable directory used for temporary storage by all users and some applications.
Rationale:
Since the /tmp directory is intended to be world-writable, there is a risk of resource exhaustion if it is not bound to a separate partition. In addition, making /tmp its own file system allows an administrator to set the noexec option on the mount.
 
Remediation:
For new installations, during installation create a custom partition setup and specify a separate partition for /tmp.
References:
1. AJ Lewis, "LVM HOWTO",
2.2 Set nodev option for /tmp Partition (Scored)
Profile Applicability:
 Level 1
Description:
The nodev mount option specifies that the filesystem cannot contain special devices.
Rationale:
Since the /tmp filesystem is not intended to support devices, set this option to ensure that users cannot attempt to create block or character special devices in /tmp.
Remediation:
Edit the /etc/fstab file and add nodev to the fourth field (mounting options). See the fstab(5) manual page for more information.
2.3 Add nodev Option to Removable Media Partitions (Not Scored)
Profile Applicability:
Level 1
Description:
Set nodev on removable media to prevent character and block special devices that are present on the removable media from being treated as device files.
Rationale:
Removable media containing character and block special devices could be used to circumvent security controls by allowing non-root users to access sensitive device files such as /dev/kmem or the raw disk partitions.
Remediation:
Edit the /etc/fstab file and add "nodev" to the fourth field (mounting options). 
2.4 Add nosuid Option to /run/shm Partition (Scored)
Profile Applicability:
 Level 1
Description:
The nosuid mount option specifies that the /run/shm (temporary filesystem stored in memory) will not execute setuid and setgid on executable programs as such, 
Rationale:
Setting this option on a file system prevents users from introducing privileged programs onto the system and allowing non-root users to execute them.
Remediation:
Edit the /etc/fstab file and add nosuid to the fourth field (mounting options).";
 
    String wordToFind = " (Scored)";
    Pattern word = Pattern.compile(wordToFind);
    Matcher match = word.matcher(text);

    for (int i = 0; i <= text.length(); i++) {
        while (match.find() && text.toLowerCase().contains(wordToFind.toLowerCase())) {
            // System.out.println("Found love at index " + match.start() + " - " + (match.end() - 1));
            // Only the matched keyword itself is extracted here, not the section around it
            String res = text.substring(match.start() + 1, match.end());
            System.out.println("****" + res); // ****Scored
        }
    }
    return true;
}
But this code only prints "****Scored".

I am a beginner in Java and I don't know how to retrieve the right text, as described at the beginning of this post.

Thanks in advance for your help.
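
For reference, here is a minimal sketch of the "find the positions, then substring" idea described above. It is only one possible approach and rests on assumptions that are not confirmed by the PDF itself: every section heading sits on its own line, starts with a number such as 2.1, and ends with "(Scored)" or "(Not Scored)". The class name ScoredSectionExtractor, the method extractScoredSections and the heading pattern are hypothetical names chosen for illustration. Note that in a regular expression the parentheses of " (Scored)" form a group instead of matching literal parentheses, which is why they are escaped here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScoredSectionExtractor {

    // Assumed heading shape: "2.1 Some title (Scored)" or "2.3 Some title (Not Scored)".
    // MULTILINE makes ^ and $ match at the start and end of each line.
    private static final Pattern HEADING = Pattern.compile(
            "^\\d+(?:\\.\\d+)+ .*\\((Not Scored|Scored)\\)[ \\t]*$",
            Pattern.MULTILINE);

    // Returns the full text of every section whose heading ends with "(Scored)".
    static List<String> extractScoredSections(String text) {
        List<Integer> starts = new ArrayList<>();  // start offset of each heading
        List<Boolean> scored = new ArrayList<>();  // true when that heading is "(Scored)"

        Matcher m = HEADING.matcher(text);
        while (m.find()) {
            starts.add(m.start());
            scored.add(m.group(1).equals("Scored"));
        }

        List<String> sections = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            if (!scored.get(i)) {
                continue; // skip "(Not Scored)" sections
            }
            // A section runs from its heading to the next heading, or to the end of the text.
            int end = (i + 1 < starts.size()) ? starts.get(i + 1) : text.length();
            sections.add(text.substring(starts.get(i), end).trim());
        }
        return sections;
    }

    public static void main(String[] args) {
        // Shortened sample input; the real text would come from the PDF extraction step.
        String text = String.join("\n",
                "2.1 Create Separate Partition for /tmp (Scored)",
                "Description:",
                "The /tmp directory is a world-writable directory.",
                "2.3 Add nodev Option to Removable Media Partitions (Not Scored)",
                "Description:",
                "Set nodev on removable media.",
                "2.4 Add nosuid Option to /run/shm Partition (Scored)",
                "Description:",
                "The nosuid mount option.");

        for (String section : extractScoredSections(text)) {
            System.out.println("****");
            System.out.println(section);
        }
    }
}
```

Matching whole heading lines, instead of searching for the word alone, gives both the start of a section and the start of the next one, so each section can be cut out with a single substring call. If the real headings look different (for example wrapped over two lines), the pattern would need to be adapted.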