Index a Book Using Word and Excel
I recently published an academic book (Anthropology at the Dawn of the Cold War: The Influence of Foundations, McCarthyism and the CIA, since you asked) and one of the tasks I was responsible for was creating an index for my book. Yes, I could have asked them to send it out to a professional indexer, but that would come out of my royalties — maybe take up all my royalties. Besides, I figured, how hard could it be?
Turns out, very hard. Indexing is not a simple exercise in any way; each entry has to be thoughtful and necessary, the best way to find a specific piece of information. You have to imagine who might use your book and what kind of information they might seek, and then predict how they might seek that information. You have to weigh every keyword — every name, theory, book title, event, place, organization, etc. — to decide whether its use in the text is significant enough to direct people to it. Like I said, it’s hard work, and much more an art than a science. (Incidentally, indexes are copyrighted works, which reflects their status as an original expression of thought.)
In the end, I did the index, and I think I did a pretty good job of it. I started by using index cards (that is their name, after all) but that got old really fast, so I developed my own system using a notepad, Excel, and Word. Here’s how:
- Go through each chapter of proofs, writing down each word you feel could be in the index, followed by the page(s) on which it appeared. Use a pen and paper for this, and allow plenty of latitude on whether or not a term should end up in the index; you’ll winnow later.
- After an initial pass through the book, make a second pass to catch any terms you don’t decide to include until a later chapter.
- One chapter at a time, copy the word lists into Excel. One column for each keyword, and another for the page numbers (multiple instances separated by paragraphs). For subheadings, put the main heading, followed by a dash, followed by the subheading, like this:
Steward, Julian — as Columbia professor
- After each page is entered, sort on the keyword column. All the subheadings sort together because they share the same first part (the heading).
- As you enter each page, check to see if there is already an entry for each term and add the page numbers to that, or add a new entry at the bottom if this is the first time the term appears. If you make a mistake and add a duplicate, no big deal, because when you sort, the duplicates will end up next to each other and can be easily identified and combined.
- Enter a page, sort, enter a page, sort, and on and on until done.
- Check to see if there are any duplicate entries and combine them: cut and paste the page numbers from one into the other’s entry, and delete the now-empty row.
- Now, edit. For any entry that has more than 5 or 6 page references, consider adding subheadings. For any entry that has only one or two page references, check to make sure the mention is significant. Review each entry and decide whether it is the best way to find the information it points to. Add cross-references (“Anti-communism, see McCarthyism”; “see also Columbia University”). This is real editing of real writing: you have to be sure that every word adds to the value of the piece, just as you would if this were a novel, short story, or essay.
- Once all the entries are in order and you’re satisfied that your index is both thorough and accurate, copy and paste the two columns into Word. Use “Paste Special” to paste as unformatted text (otherwise it will paste as a table).
- Clean up the formatting, adjust the text size and font, make everything look nice, and you’re done.
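For readers who would rather script the sort-and-combine steps than repeat them by hand, here is a minimal Python sketch. It is not the Excel workflow described above, only an illustration of the same idea: combine duplicate rows, sort so subheadings land under their heading, and flag entries that deserve a second look. The function names (`build_index`, `flag_for_review`, `format_entry`) and the thresholds are hypothetical, and a plain hyphen stands in for the dash between heading and subheading.

```python
from collections import defaultdict


def build_index(rows):
    """Combine duplicate terms and return entries sorted by term.

    rows is an iterable of (term, pages) pairs, one per spreadsheet row.
    """
    merged = defaultdict(set)
    for term, pages in rows:
        # A set drops page numbers that were entered twice for one term.
        merged[term.strip()].update(pages)
    # Case-insensitive sort; a "Heading - subheading" row shares its
    # first part with the heading, so it sorts directly after it.
    ordered = sorted(merged.items(), key=lambda item: item[0].casefold())
    return [(term, sorted(pages)) for term, pages in ordered]


def flag_for_review(entries, many=5, few=2):
    """Flag entries to revisit while editing (thresholds are assumptions)."""
    flags = {}
    for term, pages in entries:
        if len(pages) > many:
            flags[term] = "consider subheadings"
        elif len(pages) <= few:
            flags[term] = "check significance"
    return flags


def format_entry(term, pages):
    """Render one entry as a plain-text line ready to paste into Word."""
    return f"{term}, {', '.join(str(page) for page in pages)}"


if __name__ == "__main__":
    # Made-up sample rows, including one duplicate term.
    sample_rows = [
        ("Steward, Julian", [12, 15]),
        ("Columbia University", [15]),
        ("Steward, Julian - as Columbia professor", [15, 16]),
        ("Steward, Julian", [40, 12]),
    ]
    for term, pages in build_index(sample_rows):
        print(format_entry(term, pages))
```

The script only handles the mechanical part. Deciding which terms belong in the index, and which page references are significant, is still a judgment call.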
This is fascinating–I work as a freelance writer and editor, but did not realise that you could use Excel to create an index. Thank you!
Devaki: Neither did I! Until I had to figure out how to do it.
The funny thing is, before I was born my mother was a professional indexer. So I figured, “well, I’ll just ask her how to do it”. Big mistake. First of all, what happened 40-odd years ago wasn’t exactly fresh in her mind. Second of all, index cards were high-tech back then, so that’s what they used (which is, I guess, why they’re called *index* cards). Go through the book, make a note of each instance of a usage on an index card; when done, arrange the cards and type up your index.
The thought of doing that makes me cry.
Then there’s “indexing” software, which is kind of expensive and doesn’t really make an index. It makes what is called (if I remember right) a “concordance”, which is simply a list of where every word in the text appears. You could conceivably do this, trim out all the unnecessary words, and have something *like* an index — but it won’t be very useful.
Consider this website, for example. If there were a word index for this site, you’d want the entry for “index” to point to this post, and not to every post in which I might have mentioned “so I checked the index” or “so I wrote it on an index card” or “the time you work is an index of how much you get done” (Which I don’t believe! I’m just making stuff up here!) or whatever. That is, you want the index to take you to the posts/pages that really discuss the topic you’re interested in, not every single page that mentions a word in passing.
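To make the difference concrete, here is a minimal Python sketch of a concordance, assuming the text is available as a mapping from page number to page text. The `concordance` function and the sample pages are hypothetical, and real indexing software certainly does more than this. The point is only that the output lists every occurrence of every word, with no judgment about which mentions matter.

```python
import re
from collections import defaultdict


def concordance(pages):
    """Map every word to the sorted page numbers on which it appears."""
    where = defaultdict(set)
    for number, text in pages.items():
        # Crude tokenizer (an assumption): runs of letters and apostrophes.
        for word in re.findall(r"[a-z']+", text.lower()):
            where[word].add(number)
    return {word: sorted(numbers) for word, numbers in sorted(where.items())}


if __name__ == "__main__":
    # Made-up sample pages for illustration.
    sample_pages = {
        1: "So I wrote it on an index card.",
        2: "How to index a book using Word and Excel.",
    }
    for word, numbers in concordance(sample_pages).items():
        print(word, numbers)
```

Both pages show up under “index”, although only one of them is about indexing.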
So this is what I came up with. Hopefully it helps someone else out. Or someone has a better way of doing it that they’ll share with us.
> Or someone has a better way of doing it
> that they’ll share with us.
Very useful information. Thanks so much for sharing it. I do freelance editing but have shied away from indexing as too hard, but your system looks manageable. One question: how much time do you think it takes to index, say, a 100,000 word book? Or does it depend on the detail in the index? This would be useful to know for quoting prices in advance. (I imagine it’s time-consuming.)
Fran: I didn’t really keep track at the time. I’d say it was 3 or 4 long, intense days of work, and a couple of extra hours here and there. I did the initial collection of terms as I was proofing the proofs, so I suppose that saves some time from your life as a whole, although you’re still scouring proofs of a book you’ve already read a half-dozen times and are pretty much tired of.
There are several good articles about indexing, controlled vocabularies, and faceted classification at the Boxes and Arrows website.
In particular, you may find the Improving Usability with a Website Index article (and its references) very useful.
You may also find these articles interesting:
The ABCs of the BBC: A Case Study and Checklist The “For More Information” section at the end is excellent.
All About Facets & Controlled Vocabularies (Deals with see also and such.)
Thank you so much for sharing your system. I’m a graduate student and got a job indexing a book – I’d never done it before. I used your system and it worked wonderfully. You really saved me hours of trial and error.
Just a quick question – how long would you estimate this process took you? (i.e. how many hours for, say, 50 pages of text?)
Michelle: It was quite a while ago now, so I can’t remember exactly, but it was probably about 20 hours altogether for a 200-page proof.
It’s only 700 I did, and it took me 72 hours.