Replacing one text with another

**Chris33** · 15/02/2025, 23h50

Good evening,

Here is some code meant to replace a text in an ODT file using the odfpy library:

Code :

from collections.abc import Generator, Iterator

from odf.element import Element, Text
from odf.opendocument import OpenDocument

def saxiter(node: Element) -> Iterator[Element]:
    """Return an iterator over all elements reachable from node: later siblings and, recursively, all children."""
    while node:
        yield node
        if node.hasChildNodes():
            yield from saxiter(node.firstChild)
        node = node.nextSibling

def edittextElements(doc: OpenDocument, pattern: list[str]) -> Generator[tuple[str, str], str, None]:
    """Go over all elements and look for text nodes that contain one of the given patterns."""
    for elem in saxiter(doc.topnode):
        if elem.__class__ is Text:
            for pat in pattern:
                if pat in str(elem):
                    # Yield (pattern, current text); the value passed to send() becomes the new text.
                    elem.data = yield (pat, elem.data)

I found this code on this page: https://github.com/eea/odfpy/wiki/Re...eTextToAnother.

Have you had a chance to test it? Personally, I don't understand it, and I don't see how to complete it to make it work.
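
To make the traversal less mysterious, here is a small sketch of what a function like saxiter visits. It uses the standard library's xml.dom.minidom as a stand-in for odfpy (an assumption made only so the example runs without an ODT file); the XML content is hypothetical. Both libraries expose hasChildNodes(), firstChild and nextSibling, which is all the traversal needs.

```python
from xml.dom import minidom


def saxiter(node):
    """Yield node, then its descendants, then its later siblings (document order)."""
    while node:
        yield node
        if node.hasChildNodes():
            yield from saxiter(node.firstChild)
        node = node.nextSibling


# Tiny stand-in document (hypothetical content, not a real ODT file).
dom = minidom.parseString(
    "<body><p>some code</p><p>more <b>code</b> here</p></body>"
)

# Names of every node reached, in the order the generator produces them.
visited = [n.nodeName for n in saxiter(dom.documentElement)]

# Only the text nodes carry a .data string that can be searched and replaced.
text_nodes = [n.data for n in saxiter(dom.documentElement)
              if n.nodeType == n.TEXT_NODE]

print(visited)
print(text_nodes)
```

Note that the second paragraph is split into three text nodes by the inline markup. A search done node by node, as in the code above, would therefore not find a pattern that straddles two nodes.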

**jurassic pork** · 16/02/2025, 09h15

Hello,
Do you have LibreOffice or OpenOffice installed on your computer?
What is your OS?
Amicably, J.P

**Chris33** · 16/02/2025, 11h21

Hello Jurassic,

Yes, I have LibreOffice and I'm on Windows 11.

**jurassic pork** · 16/02/2025, 12h30

And why do this with an external Python, when LibreOffice ships its own Python and can run Python code in macros that manipulate LibreOffice documents?

**Chris33** · 16/02/2025, 13h07

The goal is to manipulate ODT files directly from my own Python scripts, without going through LibreOffice.

**jurassic pork** · 16/02/2025, 14h16

I found some code that uses odfpy and looks decent:

Code :

from odf.opendocument import load
from odf import text, teletype
 
textdoc = load("D:/dev/LibreOffice/Questions2.odt")
texts = textdoc.getElementsByType(text.P)
s = len(texts)
for i in range(s):
    old_text = teletype.extractText(texts[i])
    new_text = old_text.replace('Produits', 'Produit')
    new_S = text.P()
    new_S.setAttribute("stylename", texts[i].getAttribute("stylename"))
    new_S.addText(new_text)
    texts[i].parentNode.insertBefore(new_S, texts[i])
    texts[i].parentNode.removeChild(texts[i])
textdoc.save('D:/temp/ficResultat.odt')

The trouble is that, with the ODT file I'm testing on, texts holds 28 elements at the start of the loop, but the number of elements in texts shrinks as the loop progresses, so that eventually we get a list index out of range error.
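
The symptom described here is what happens whenever a list shrinks while a loop walks it with indices computed up front. The sketch below reproduces it with a plain Python list; that the list returned by getElementsByType behaves as a live list is an assumption inferred from the symptom, not something the thread confirms. The function names are hypothetical.

```python
def replace_by_index(live, old, new):
    """Mimic the failing loop: the length is read once, then items are removed."""
    result = []
    for i in range(len(live)):
        item = live[i]            # raises IndexError once the list has shrunk enough
        result.append(item.replace(old, new))
        live.remove(item)         # stand-in for a removal that shrinks the live list
    return result


def replace_from_snapshot(live, old, new):
    """Iterate over a copy, so removals from the live list cannot shift the indices."""
    result = []
    for item in list(live):       # snapshot taken before any removal
        result.append(item.replace(old, new))
        live.remove(item)
    return result


paragraphs = ["Produits A", "Produits B", "Produits C", "Produits D"]

try:
    replace_by_index(list(paragraphs), "Produits", "Produit")
    failed = False
except IndexError:
    failed = True

fixed = replace_from_snapshot(list(paragraphs), "Produits", "Produit")
print(failed, fixed)
```

If the assumption holds, copying the result once before the loop (for instance with list(...)) would be the corresponding change in the odfpy script.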

**fred1599** · 16/02/2025, 17h00

Hello,

A suggestion:

Code :

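from odf.opendocument import load

doc = load('test.odt')

patterns = ["Replace this"]
gen = edittextElements(doc, patterns)

try:
    # Prime the generator: run it up to its first yield.
    pat, old_text = next(gen)
    while True:
        new_text = old_text.replace(pat, "to this")
        # send() stores new_text in the element and returns the next match,
        # so its return value must be kept rather than discarded.
        pat, old_text = gen.send(new_text)
except StopIteration:
    pass

doc.save('result.odt')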
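
One detail of the generator protocol matters for any driver loop of this kind: send() both delivers a value to the paused yield and returns the next yielded value. The sketch below shows the difference on a plain list; edit_items, drive and drive_discarding are hypothetical names used only for this illustration.

```python
def edit_items(items, pattern):
    """Yield (pattern, text) for each matching item; the value sent back replaces it."""
    for i, value in enumerate(items):
        if pattern in value:
            items[i] = yield (pattern, value)


def drive(items, pattern, replacement):
    """Prime once with next(), then keep every pair returned by send()."""
    gen = edit_items(items, pattern)
    try:
        pat, old = next(gen)
        while True:
            pat, old = gen.send(old.replace(pat, replacement))
    except StopIteration:
        pass
    return items


def drive_discarding(items, pattern, replacement):
    """Call next() every round and ignore what send() returns."""
    gen = edit_items(items, pattern)
    try:
        while True:
            pat, old = next(gen)   # equivalent to send(None) after the first round
            gen.send(old.replace(pat, replacement))
    except StopIteration:
        pass
    return items


good = drive(["a code", "b code", "c code"], "code", "text")
bad = drive_discarding(["a code", "b code", "c code"], "code", "text")
print(good)
print(bad)
```

In the second driver, every other match is consumed by the ignored return value of send(), and the following next() then writes None into it. With a single match in the document the two loops behave the same, which can hide the problem during a quick test.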

**jurassic pork** · 16/02/2025, 17h46

Since we're on Windows, here is a suggestion using pywin32 and access to COM objects through win32com:

Code :

import win32com.client
import subprocess
def PropertyValueArray(manager,num):
    '''Creates an openoffice property value array'''
    l = []
    for x in range(num):
        _p = manager.Bridge_GetStruct("com.sun.star.beans.PropertyValue")
        _p.Name = ''
        _p.Value = ''
        l.append(_p)
    return l
 
# Start LibreOffice in server mode
starter = ['C:/Program Files/LibreOffice/program/soffice.exe', '--accept=socket,host=localhost,port=2002;urp;', '--norestore', '--nologo', '--nodefault', '--headless']
proc = subprocess.Popen(starter)
# Connect to the running LibreOffice instance
objServiceManager = win32com.client.Dispatch("com.sun.star.ServiceManager")
objDesktop = objServiceManager.CreateInstance("com.sun.star.frame.Desktop")
objServiceManager._FlagAsMethod("Bridge_GetStruct")
objServiceManager._FlagAsMethod("Bridge_GetValueObject")
p = PropertyValueArray(objServiceManager,1)
p[0].Name = 'Hidden'  # doc should run hidden
p[0].Value = True  # doc should run hidden
document = objDesktop.loadComponentFromURL('file:///D:/dev/LibreOffice/Questions2.odt', "_blank", 0, p)
search = document.createSearchDescriptor()
search.SearchString = "Produits"
search.SearchAll = True
search.SearchWords = True
search.SearchCaseSensitive = False
selsFound = document.findAll(search)
if selsFound.getCount() == 0:
    proc.kill()
    raise SystemExit  # a bare `exit` only names the function without calling it
print(str(selsFound.getCount()) + " words found!")
for selIndex in range(0, selsFound.getCount()):
    selFound = selsFound.getByIndex(selIndex)
    selFound.setString("MesProduits")
# Save the document as ODT
document.storeAsURL("file:///D:/temp/document.odt", ())
# Close the document
document.close(True)
# Kill server
proc.kill()

1 - Start LibreOffice in headless server mode
2 - Connect to the service
3 - Open the source document in Hidden mode
4 - Display the number of words found
5 - Replace one word with another (the search options can be changed)
6 - Save the changes to another ODT file
7 - Stop the LibreOffice server

**Chris33** · 17/02/2025, 12h41

Thanks Fred for this suggestion. However, when I test:

Code :

from odf import opendocument, text, teletype
from odf.text import P
from odf.style import Style, TextProperties, ParagraphProperties
doc = opendocument.load(r"C:\Users\chris\Documents\mesScryptPython\fichiers odt\mon second document.odt")
 
 
def saxiter(Element1): #-> Iterator[Element]:
    """Return an interator over all elements reachable from node: later siblings and recursively all children."""
    while node:
        yield node
        if node.hasChildNodes():
            yield from saxiter(node.firstChild)
        node = node.nextSibling
 
def edittextElements(doc, pattern): #-> Generator[tuple[str, str], str, None]:
    """Goes over all elements, and look for the text that contains the given."""
    for elem in saxiter(doc.topnode):
        if elem.__class__ is Text:
            for pat in pattern:
                if pat in str(elem):
                    elem.data = yield (pat, elem.data)
 
patterns = ["code"]
gen = edittextElements(doc, patterns)
 
 
try:
    while True:
        pat, old_text = next(gen)
        new_text = old_text.replace(pat, "to this")
        gen.send(new_text)
        print(gen.send(new_text))
except StopIteration:
    pass
 
try:
    while True:
        pat, old_text = next(gen)
        new_text = old_text.replace('code', "to this")
        gen.send(new_text)
        doc.save(r"C:\Users\chris\Documents\mesScryptPython\fichiers odt\mon second document3.odt")
except StopIteration:
    pass

I get the following error message:
Traceback (most recent call last):
File "C:\Users\chris\Documents\mesScryptPython\fichiers odt\remplacer_texte4.py", line 39, in <module>
pat, old_text = next(gen)
^^^^^^^^^
File "C:\Users\chris\Documents\mesScryptPython\fichiers odt\remplacer_texte4.py", line 20, in edittextElements
for elem in saxiter(doc.topnode):
^^^^^^^^^^^^^^^^^^^^
File "C:\Users\chris\Documents\mesScryptPython\fichiers odt\remplacer_texte4.py", line 12, in saxiter
while node:
^^^^
UnboundLocalError: cannot access local variable 'node' where it is not associated with a value

**umfred** · 17/02/2025, 13h32

You copied the original code incorrectly (source https://github.com/eea/odfpy/wiki/Re...eTextToAnother)

Code :

def saxiter(node : Element) -> Iterator[Element]:
    """Return an iterator over all elements reachable from node: later siblings and, recursively, all children."""
    while node:
        yield node
        if node.hasChildNodes():
            yield from saxiter(node.firstChild)
        node = node.nextSibling
 
def edittextElements(doc : OpenDocument, pattern : list[str]) -> Generator[tuple[str, str], str, None]:
    """Go over all elements and look for text nodes that contain one of the given patterns."""
    for elem in saxiter(doc.topnode):
        if elem.__class__ is Text:
            for pat in pattern:
                if pat in str(elem):
                    elem.data = yield (pat, elem.data)

The parameter of your saxiter function should be named node; alternatively, rename every node in the function body to Element1 (the name of your parameter).
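
The error can be reproduced without odfpy. In the sketch below (hypothetical functions working on a small linked chain of dicts), the parameter is named Element1 while the body reads node. Because node is assigned later in the same function, Python treats it as a local variable, and reading it before that assignment raises UnboundLocalError. Since the function is a generator, the error only appears at the first next(), which matches the traceback above.

```python
def walk_broken(Element1):
    """Parameter is named Element1, but the body uses node."""
    while node:                  # node is a local that has no value yet
        yield node
        node = node.get("next")


def walk_fixed(node):
    """Same body, with the parameter name matching the variable used."""
    while node:
        yield node
        node = node.get("next")


chain = {"name": "a", "next": {"name": "b", "next": None}}

try:
    next(walk_broken(chain))     # the generator body starts running here
    error = None
except UnboundLocalError as exc:
    error = exc

names = [n["name"] for n in walk_fixed(chain)]
print(type(error).__name__, names)
```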

**Chris33** · 18/02/2025, 12h01

Oops... I fixed it and... it works.
Many thanks for the help you gave me. This little piece of code really moved me forward.

Maybe one more question:
I see that in this code it is the line

Code :

elem.data = yield (pat, elem.data)

that assigns the new text directly to the element, whereas I had started down the wrong track with something like:

Code :

element.parentNode.insertBefore(new_element, element)

Yes, when I call dir(elem) I do find data among the attributes, but how can one get more information? help(elem) and help(elem.data) give nothing useful.
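
A possible way to dig further, sketched on a hypothetical stand-in class rather than on odfpy itself: an attribute that is simply assigned in __init__ has no docstring of its own, so the useful information sits on the class of the object, not on the attribute. Inspecting type(elem) and vars(elem) usually says more than help(elem.data), which only describes the type of the stored value.

```python
class TextNode:
    """Hypothetical stand-in for a text node whose content lives in .data."""

    def __init__(self, data):
        self.data = data         # plain instance attribute: it carries no docstring


node = TextNode("some code")
cls = type(node)

info = {
    "class": cls.__name__,                              # which class the object is
    "module": cls.__module__,                           # where that class is defined
    "bases": [c.__name__ for c in cls.__mro__],         # inheritance chain
    "instance_attributes": sorted(vars(node)),          # attributes set on this object
    "data_type": type(node.data).__name__,              # what kind of value .data holds
}

for key, value in info.items():
    print(f"{key}: {value}")
```

For a class that comes from an installed package, help(type(elem)) documents the class, and inspect.getsourcefile(type(elem)) generally gives the path of the file that defines it, which can then be read directly.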
