Convert PDF Documents to Word or Rich Text Format
Saving documents as PDFs has become trivially easy. A huge number of PDF creator apps have emerged, most of them free, and almost all of them quite simple to use. Programs like Word 2007 and OpenOffice.org have “save as PDF” built in (you need an add-on from Microsoft to do this in Word 2007, but it’s part of the normal interface once you install the add-on). Adobe’s Acrobat.com lets you save to PDF from their word processor, Buzzword, and includes a PDF converter that will transform any document you upload to PDF.
What if you want to go the other way, though? That is, what if you want to get the text back out of a PDF so you can edit it in your normal word processor? This is quite a bit harder than creating a PDF — strange things happen to the original text when you create a PDF that make it quite difficult to pull the text and, especially, the formatting out.
Enter PDFtoWord, a free web-based service that has just opened to the public. PDFtoWord is simple: you select a PDF file on your hard drive, choose whether you want the output as a Word (.doc) file or a Rich Text Format (.rtf) file, enter your email address, and click “convert”. Within an hour or so (like I said, this kind of conversion is difficult!), PDFtoWord emails you the result: a very nicely formatted, ready-to-edit word processor file.
I tried it with a copy of my e-book for students, Don’t Be Stupid, a complexly formatted document of about 80 pages, laid out into a dozen chapters and a few appendices. PDFtoWord preserved the pagination, the chapter breaks, the text formatting (though not the styles used), and every line of white space — the document I got back looked remarkably similar to the document I’d sent, far exceeding my expectations. The missing elements are things I couldn’t imagine there being a way to preserve, like the styles — I don’t know how the program could guess that all large bold text aligned right should be “Heading 3”.
As a free service, PDFtoWord performs admirably, even better than some paid programs I’ve tried. PDFtoWord is offered by NitroPDF, which makes several other free, web-based PDF utilities for creating and even editing PDFs, in addition to its paid desktop program NitroPDF Professional, which aims to be a sort of “Acrobat Lite” for creating, manipulating, editing, and combining PDF files.
PDFtoWord (free)
Hey, great site.
I wasn’t sure if you were aware that the latest edition of Adobe Reader does actually allow you to save to .txt. I realise this isn’t technically the same as a .doc, but then you can always cut and paste!
I believe this is a relatively new feature of Adobe Reader, because before I saw your post I had no idea it was even possible to do. Which is why I was generally quite excited when I read your title (sad, I know).
Daniel: It’s always been somewhat easy to get text from a PDF. For one thing, as long as the text is there (that is, it’s not a scanned image only), you could cut and paste. The problem is getting something that has similar formatting and layout to your original. For instance, many academic writers do new editions of textbooks, and their publishers want them to start from the published versions (typically they only want to change 1/3 or less of the pages so that they don’t have to re-edit the whole book). The publisher might send the PDF proofs from the last edition, but what’s an author to do with those? If you export just the text, you lose all the headings, italics, boldface (e.g. vocab words), etc. and have to recreate it all over again.
I am using Tweak PDF to Word, another PDF converter.
It only costs a little but performs well.
You can try it at http://www.tweakpdf.com/ if you need a converter too.
I found an alternative that might do a better job. It’s a program called pdftohtml. The SourceForge site is: https://sourceforge.net/projects/pdftohtml/. It is well worth looking at, but only if you’re not afraid of the command line.
Happy hunting!
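For anyone curious what the command-line route might look like, here is a minimal Python sketch that wraps a pdftohtml call. It assumes the pdftohtml binary is installed and on your PATH, and that your build accepts the -noframes (single HTML file) and -c (try to preserve complex layout) options; the helper names build_command and convert are made up for illustration, so check your own version's help output before relying on any of this.

```python
import shutil
import subprocess


def build_command(pdf_path, out_base, complex_layout=True):
    """Build the argument list for a pdftohtml run (hypothetical helper)."""
    # -noframes asks for one HTML file instead of a frameset (assumed flag).
    cmd = ["pdftohtml", "-noframes"]
    if complex_layout:
        # -c asks the tool to try to keep the page layout (assumed flag).
        cmd.append("-c")
    cmd += [pdf_path, out_base]
    return cmd


def convert(pdf_path, out_base):
    """Run pdftohtml on pdf_path; fail early if the tool is missing."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml not found on PATH")
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_command(pdf_path, out_base), check=True)
```

Calling convert("proofs.pdf", "proofs") should then leave an HTML version of the proofs next to the original, which most word processors can open and save as .doc or .rtf. The file names here are placeholders.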
There is another alternative available, Saaspose.PDF. Their site is http://saaspose.com/api/pdf. It can convert your PDF document to Word and to many other formats, and because it’s a cloud API you don’t have to download or install anything: just sign up on the site, upload your document, and convert it into the format you want.