Is it possible to read Word files (.doc/.docx) in Python?


I want to create a validation tool.

Can anyone help me read .doc/.docx documents in Python in order to search and compare the file contents?

Yes, it is possible. LibreOffice (at least) has a command-line option to convert files, which works a treat. Use it to convert the file to text, then load the text file into Python as per routine manoeuvres.

This worked for me on LibreOffice 4.2 / Linux:

soffice --headless --convert-to txt:Text /path_to/document_to_convert.doc


I've tried a few other methods (including odt2txt, antiword, zipfile, lpod, uno). The soffice command above was the first that worked, and without error. This question on using filters with soffice on ask.libreoffice.org helped me.
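As a rough sketch of the "convert, then load into Python" step, the conversion can be driven from Python with subprocess. The helper names build_convert_command and read_word_text are made up for illustration. The sketch assumes soffice is on the PATH, that the filter name is spelled Text, that the --outdir option is available in your LibreOffice version, and that the converted file can be read as UTF-8 (the actual encoding may depend on your locale).

```python
import subprocess
import tempfile
from pathlib import Path


def build_convert_command(doc_path, out_dir):
    """Return the soffice argument list that converts doc_path to plain text.

    Hypothetical helper: it only mirrors the command line shown above and
    adds --outdir so the output location is predictable.
    """
    return [
        "soffice", "--headless",
        "--convert-to", "txt:Text",
        "--outdir", str(out_dir),
        str(doc_path),
    ]


def read_word_text(doc_path, encoding="utf-8"):
    """Convert a .doc/.docx file with soffice and return its text.

    Assumes soffice is installed and on the PATH; the encoding is a guess.
    """
    doc_path = Path(doc_path)
    with tempfile.TemporaryDirectory() as out_dir:
        # check=True raises CalledProcessError if soffice exits with an error.
        subprocess.run(build_convert_command(doc_path, out_dir), check=True)
        # soffice names the output after the input file, with a .txt suffix.
        txt_path = Path(out_dir) / (doc_path.stem + ".txt")
        return txt_path.read_text(encoding=encoding, errors="replace")
```

Once read_word_text returns a string, searching is an ordinary substring or regular-expression test, and two documents can be compared with the standard difflib module.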

