Moving to Linux: Working with Text (Part 2)
In the last installment, we examined how to compose and check your writing using the Linux tools txt2tags and aspell. Let’s assume that you’ve used these tools now, and used them quite a lot. You now have several directories and sub-directories filled with dozens of text files. How do you organize all of this text?
Don’t worry, Linux has you covered.
Searching
Unix users have become very adept at managing plain text files that might be stored all over a computer (including extremely large server systems). Historically, unlike systems such as Windows that often store system configuration data in a proprietary (read: bizarre, and all-but-incomprehensible to even veteran users) format, most Unix systems keep their configuration files as plain text. So, in order for Unix admins to find these configurations across an entire system, many utilities to search text were created.
Linux includes these, and you can use them not only to find and manage system configs, but anything stored in plain text (such as HTML files). The Linux utility for this is called grep. Please don’t ask me what the word means (it comes from the old ed editor command g/re/p, “globally search for a regular expression and print”). Searching with it is as easy as entering a command like the one below:
grep -R yoursearchword *
This command searches every file in your current directory, and in every directory below it (and the ones below those, etc…), for the term “yoursearchword.” The “-R” flag stands for “recursive,” which tells grep to descend into all sub-directories. The “*” acts as you might expect and matches all files; it could just as easily have been “*.txt” to search text files or “*.html” to search web page files. Bear in mind that a pattern like that is matched only against names in the current directory, so to limit a recursive search by extension, add grep’s “--include” option (for example, --include="*.txt", with “.” as the starting point). The command prints one line for every match, made up of the name of the file, a colon, and the line of text that contains the search term.
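To get a feel for that output, here is a small sketch you can paste into a terminal. The directory and file names (drafts, chapter1.txt, and so on) are made up for illustration, and the “-l” and “--include” options assume GNU grep, which is what most Linux distributions ship.

```shell
# Build a throwaway directory tree to search (all names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/drafts/old"
printf 'The quick brown fox\n' > "$demo/notes.txt"
printf 'A genius paragraph about a fox\n' > "$demo/drafts/old/chapter1.txt"
printf '<p>The fox page</p>\n' > "$demo/drafts/index.html"

cd "$demo"

# Recursive search: each hit is printed as "file:matching line".
grep -R fox *

# -l lists only the names of the files that matched.
matches=$(grep -R -l fox . | sort)
echo "$matches"

# --include restricts a recursive search to file names matching a pattern.
txt_only=$(grep -R -l --include='*.txt' fox . | sort)
echo "$txt_only"
```

The last command leaves out index.html even though it contains the word, because only names ending in “.txt” are searched.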
Now that we’ve found what we’re looking for, let’s say it was an old version of a draft. You might find it useful to compare it to what you have currently, say, to save that one genius piece of prose that you’d forgotten to copy over.
Comparing
There is also a Unix utility for comparing two text files for changes between them. diff looks at two files, line by line, and identifies anywhere that the two don’t match up. It will output the line numbers, and the lines themselves, wherever the files differ. So, if you know there’s a paragraph that’s missing from your current file, and it was in a different file, then you’ll be able to find it. This may not seem like a benefit if you’re comparing two files: after all, it might be easier to just look at them. Maybe… but it will probably take you more than 0.7 seconds, which is the length of time a two-file compare would take. You could do so with the following command:
diff yourtextfile1.txt yourtextfile2.t2t
Note the different extensions: diff doesn’t care what the files are called, as long as they’re text. So you could compare a “.txt” file to an “.html” file. Just be prepared for a lot of results.
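If you haven’t seen diff output before, this short sketch shows what to expect. The two files and their contents are invented for the example; only the second line differs between them.

```shell
# Two short drafts that differ only on their second line (hypothetical files).
work=$(mktemp -d)
printf 'Opening line\nThe old sentence\nClosing line\n' > "$work/draft1.txt"
printf 'Opening line\nThe new sentence\nClosing line\n' > "$work/draft2.t2t"

# diff exits with status 1 when the files differ, so "|| true" keeps
# a script running under "set -e" from stopping here.
changes=$(diff "$work/draft1.txt" "$work/draft2.t2t" || true)
echo "$changes"
# 2c2
# < The old sentence
# ---
# > The new sentence
```

The “2c2” line means that line 2 of the first file was changed into line 2 of the second. Lines starting with “<” come from the first file, and lines starting with “>” come from the second.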
But what if your genius paragraph is buried somewhere in your several-levels-deep directory tree with dozens of files? diff will allow you to set a “base” file to compare all other files against. Consider the following:
find . -type f -exec diff -y --to-file="yourbasefile.txt" {} \;
Here, find does the walking: the “.” is Unix’s (and DOS’, if you remember) abbreviation for your current directory, “-type f” picks out every regular file in it and in all the directories below it, and “-exec” runs diff once per file, with “{}” standing in for that file’s name. (diff has a recursive flag of its own, “-r”, but it pairs up files of the same name in two directory trees rather than checking everything against a single file.) The “-y” flag tells diff to give you a side-by-side listing (you’ll want this format at first). The “--to-file=” flag instructs diff to compare the file it is handed against the file “yourbasefile.txt.” So, the above command will go through the current directory, and all those below it, and compare each file it finds to “yourbasefile.txt.” Sounds slightly more useful? I agree.
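With dozens of files, a side-by-side listing of everything gets long quickly. One way to narrow it down, sketched here under a few assumptions, is to find out first which files differ from the base and only then look at the details. The directory layout and file contents are invented for the example, and “--suppress-common-lines” assumes GNU diff.

```shell
# Hypothetical layout: one base file plus drafts scattered through sub-directories.
tree=$(mktemp -d)
mkdir -p "$tree/old/older"
printf 'Intro\nBody\n' > "$tree/yourbasefile.txt"
printf 'Intro\nBody\n' > "$tree/old/same.txt"
printf 'Intro\nBody\nA forgotten genius paragraph\n' > "$tree/old/older/draft.txt"

cd "$tree"

# Walk every regular file below the current directory (skipping the base file
# itself) and keep the names of those whose contents differ from the base.
# diff -q prints at most a one-line notice and exits 0 when files are identical.
differing=$(find . -type f ! -name yourbasefile.txt | sort | while read -r f; do
    diff -q yourbasefile.txt "$f" > /dev/null || echo "$f"
done)
echo "$differing"

# For each differing file, show a side-by-side view that hides shared lines.
echo "$differing" | while read -r f; do
    [ -n "$f" ] || continue
    echo "== $f"
    diff -y --suppress-common-lines yourbasefile.txt "$f" || true
done
```

In this made-up tree only old/older/draft.txt is reported, since old/same.txt matches the base file exactly.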
The above two programs, together with the drafting tools we discussed last time, give most writers everything they need to draft their ideas. But once your ideas are on-screen, what then? Most writers will need to add formatting (or confirm it, since we’ve done that already with txt2tags), add things like tables of contents and indexes, and possibly collaborate with others on authoring. There are certainly ways to accomplish these using plain-text tools, but for many, using other programs will be more convenient. In the next installment, we’ll look at the latest version of the king of Linux word processors, OpenOffice.org, and see what’s new for writers.
I’ve been writing in plain text files for a few months now. One reason I chose to do so is grep. I can find anything I’ve written instantaneously. It’s a great way to keep my perfectionist mind at ease.