Lucene: position within the document

**Raphael94** · 19/04/2007, 19h39

I'm new to the Lucene API. After running a query for a word, I want to display the context in the document where the word occurs. In other words, I want a classic Google-style result display, with an excerpt of the document around the matched word.
To do this I started from the demos that ship with Lucene.
For indexing, I added "Field.TermVector.WITH_POSITIONS_OFFSETS" like this:

Code:

    Document doc = new Document();

    // Add the path of the file as a field named "path".  Use a field that is
    // indexed (i.e. searchable), but don't tokenize the field into words.
    doc.add(new Field("path", f.getPath(), Field.Store.YES, Field.Index.UN_TOKENIZED));

    // Add the last modified date of the file a field named "modified".  Use
    // a field that is indexed (i.e. searchable), but don't tokenize the field
    // into words.
    doc.add(new Field("modified",
        DateTools.timeToString(f.lastModified(), DateTools.Resolution.MINUTE),
        Field.Store.YES, Field.Index.UN_TOKENIZED));

    // Add the contents of the file to a field named "contents".  Specify a Reader,
    // so that the text of the file is tokenized and indexed, but not stored.
    // Note that FileReader expects the file to be in the system's default encoding.
    // If that's not the case searching for special characters will fail.
    doc.add(new Field("contents", new FileReader(f), Field.TermVector.WITH_POSITIONS_OFFSETS));

That covers the indexing.
But at search time, I can't manage to retrieve the TermVector that is supposed to contain the position offsets of the searched word.
Or maybe I have misunderstood everything (the API is not that simple) and it has to be done differently.
PS: I have already read the Lucene tutorial at http://gfx.developpez.com/tutoriel/java/lucene/#L1, which is a good introduction by the way.
Thanks in advance.

**Raphael94** · 23/04/2007, 11h12

In the end I went with the brute-force method:
searching for the keyword myself in the list of "hits" files found by Lucene.
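
For anyone landing here, a minimal sketch of what such a brute-force excerpt could look like. It assumes the text of each hit has already been read into a String (for instance by reopening the file named in the stored "path" field). The names SnippetExtractor, snippet, indexOfIgnoreCase and the radius parameter are hypothetical and not part of Lucene; this is plain Java, not the code actually used above.

```java
class SnippetExtractor {

    /** Index of the first case-insensitive occurrence of keyword in text, or -1. */
    static int indexOfIgnoreCase(String text, String keyword) {
        for (int i = 0; i + keyword.length() <= text.length(); i++) {
            // regionMatches compares in place, so offsets stay valid for the original text.
            if (text.regionMatches(true, i, keyword, 0, keyword.length())) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Returns up to `radius` characters of context on each side of the first
     * occurrence of keyword, or null when the keyword does not occur.
     * "..." marks the sides where the text was cut.
     */
    static String snippet(String text, String keyword, int radius) {
        int hit = indexOfIgnoreCase(text, keyword);
        if (hit < 0) {
            return null;
        }
        int start = Math.max(0, hit - radius);
        int end = Math.min(text.length(), hit + keyword.length() + radius);

        StringBuilder sb = new StringBuilder();
        if (start > 0) {
            sb.append("...");
        }
        sb.append(text, start, end);
        if (end < text.length()) {
            sb.append("...");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "Apache Lucene is a search library written in Java.";
        System.out.println(snippet(text, "search", 10));
    }
}
```

This only shows the first occurrence and cuts on character counts rather than word boundaries, so the excerpt may start or end in the middle of a word.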
