Is it possible to read Word files (.doc/.docx) in Python?


I want to create a validation tool.

Can anyone help me read .doc/.docx documents in Python in order to search and compare file contents?

Yes, it is possible. LibreOffice (at least) has a command-line option to convert files, and it works a treat. Use it to convert the file to text, then load the text file into Python in the usual way.

This worked for me with LibreOffice 4.2 on Linux:

soffice --headless --convert-to txt:Text /path_to/document_to_convert.doc


I tried a few other methods (including odt2txt, antiword, zipfile, lpod, and uno). The soffice command above was the first one that worked, and it ran without errors. A question on ask.libreoffice.org about using filters with soffice helped me.
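The soffice command above can also be driven from Python, which covers the "convert, then load the text file" step in one place. This is only a minimal sketch under a few assumptions: `soffice` is on the PATH, the converted file is readable as UTF-8 (the actual encoding may depend on your locale), and the function names (`build_convert_command`, `expected_text_path`, `read_word_document`, `lines_containing`) are hypothetical helpers made up for this example. The `--outdir` option is added so the .txt file lands in a known directory, and the filter is written as `txt:Text`, which I believe is how LibreOffice spells its plain-text filter name.

```python
import subprocess
from pathlib import Path


def build_convert_command(doc_path, out_dir):
    """Return the soffice argument list that converts doc_path to plain text."""
    return [
        "soffice",
        "--headless",
        "--convert-to", "txt:Text",
        "--outdir", str(out_dir),
        str(doc_path),
    ]


def expected_text_path(doc_path, out_dir):
    """soffice keeps the base file name and swaps the extension for .txt."""
    return Path(out_dir) / (Path(doc_path).stem + ".txt")


def read_word_document(doc_path, out_dir, encoding="utf-8"):
    """Convert a .doc/.docx file with soffice and return its text content.

    The encoding is an assumption; undecodable bytes are replaced
    rather than raising an error.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if soffice exits with an error.
    subprocess.run(build_convert_command(doc_path, out_dir), check=True)
    text_path = expected_text_path(doc_path, out_dir)
    return text_path.read_text(encoding=encoding, errors="replace")


def lines_containing(text, needle):
    """Return every line of text that contains needle (simple search)."""
    return [line for line in text.splitlines() if needle in line]
```

Something like `lines_containing(read_word_document("report.docx", "converted"), "invoice")` would then search one document; to compare two documents, the two returned strings can be passed to the standard `difflib` module.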

