While tag cloud generators are all the rage for visually analyzing the text content of various popular Web sites and documents, I decided to go back to an old-fashioned "keyword density"
analyzer tool to take a look at Obama's and McCain's recent party nomination acceptance speeches.
Keyword density tools have been used by search optimizers for many years to
determine the keyword frequency and weight of words and phrases on a Web page. The more popular tools count one-, two-, and three-word phrases throughout a document, showing the
number of times each phrase appears, as well as the percentage of the document it accounts for. This type of analysis tends to be more useful to SEOs for two- and three-word phrases, but for
this analysis, the tool will also shed light on single-word themes to illustrate the overall tone of the candidates' acceptance speeches.
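For readers curious about the mechanics, here is a minimal Python sketch of the kind of counting such a tool performs. It is not the tool used for this analysis. The function names `tokenize` and `phrase_frequencies` are made up for illustration, and the tokenizing rule (lowercase words, apostrophes kept) is an assumption that real tools may handle differently.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words, keeping apostrophes (e.g. "don't").

    Assumption: a real keyword tool may tokenize differently.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def phrase_frequencies(tokens, n):
    """Return (phrase, count, percent) tuples for every n-word phrase.

    The percentage is relative to the total number of n-word phrases,
    so one-, two-, and three-word phrases each have their own base.
    """
    phrases = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    total = len(phrases)
    counts = Counter(phrases)
    # most_common() sorts by count, highest first
    return [(p, c, 100.0 * c / total) for p, c in counts.most_common()]


if __name__ == "__main__":
    sample = "The promise of change and the promise of hope"
    words = tokenize(sample)
    for n in (1, 2, 3):
        phrase, count, pct = phrase_frequencies(words, n)[0]
        print(f"top {n}-word phrase: {phrase!r} ({count} times, {pct:.2f}%)")
```

Because each phrase length is measured against its own total, a two-word phrase percentage is not directly comparable to a single-word percentage.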
On a side note, I personally don't care
much for the phrase "keyword density," because these tools typically don't weight keywords in a document -- in other words, they don't take semantic markup elements into
account -- they simply score the frequency of words in a document. Even if a particular tool does provide weighted analysis of keywords, it's an educated guess at best, because there are too
many variables used by engines to determine actual weight, and the engines don't disclose how much weight they give these elements. So to clarify, this analysis focuses on the frequency
of words in the speeches.
One other interesting finding from the keyword frequency tool was that the word scores were sometimes at odds with the tag cloud analysis on the speeches reported in
various blogs and online media. The inference from the tag cloud was that larger words were spoken more frequently, and had more importance. The keyword frequency analysis tool showed different
results in some important areas. McCain's use of "fight," for example, wasn't weighted as much in many tag clouds, and appeared to be less important, though it was the second most
frequently used word in the entire speech.
A stop-word list was used, so common words such as "the," "and," "would," and "until" were stripped from the
overall speeches for analysis. Keep in mind that frequency percentages are relative to the shortened versions of the documents, and not the full-length speeches.
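To see why the base matters, here is a small Python sketch of stop-word filtering. The stop list contains only the four words named above, whereas a real list would be much longer, and the function name `filtered_frequencies` is hypothetical. Treat it as an illustration under those assumptions, not as the tool's actual method.

```python
import re
from collections import Counter

# Assumption: only the four stop words named in the article; real lists are far longer.
STOP_WORDS = {"the", "and", "would", "until"}


def filtered_frequencies(text, stop_words=STOP_WORDS):
    """Count words after removing stop words.

    Returns (freqs, full_length, shortened_length), where freqs maps each
    remaining word to (count, percent) and percent is relative to the
    shortened document, not the full-length text.
    """
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    kept = [t for t in tokens if t not in stop_words]
    counts = Counter(kept)
    freqs = {w: (c, 100.0 * c / len(kept)) for w, c in counts.items()}
    return freqs, len(tokens), len(kept)


if __name__ == "__main__":
    sample = "The promise and the change would keep the promise"
    freqs, full_len, kept_len = filtered_frequencies(sample)
    count, pct = freqs["promise"]
    print(f"{full_len} words before filtering, {kept_len} after")
    print(f"'promise': {count} times, {pct:.1f}% of the shortened text")
    print(f"against the full text it would be {100.0 * count / full_len:.1f}%")
```

In the sample sentence, "promise" is two of the four words left after filtering but only two of the nine in the full sentence, so the same count yields a much higher percentage once stop words are removed.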
Of course, this analysis
takes a very literal view of the speeches, and doesn't capture the overall meaning of what was said (though it points in the right direction in many ways). But it is fun to review and analyze, and
there are some interesting linguistic leanings nonetheless.
Here is what I found:
- Obama used the single word "promise" more than any other word (31 times, or 1.61%
frequency).
- Obama's most used two-word phrase was "John McCain" (15 times, 1% frequency). He also used his opponent's last name, "McCain," 21 times
throughout the speech -- the third most popular word in his speech.
- Variations of "America" were obviously popular with both candidates. Obama used "America" 26
times (1.35%), "American" 18 times (0.93%), and "Americans" eight times (0.41%), for a total of 52 uses of variations on "America." McCain used "Americans" 17
times (0.92%), "American" nine times (0.49%), and "America" five times (0.27%), for a total of 31 references.
- McCain used "country" more than any other word,
for a total of 30 times. His most frequent two-word phrase was "Senator Obama," at six times, or 0.46% of the speech. The number-two word in the entire McCain speech was "fight"
(22 times, or 1.19%).
- "Going" turned out to be McCain's version of Obama's "promise," and was used 16 times, or 0.87% frequency.
- In addition to Obama's main theme of "promise" to Americans, he also used the word "change" 15 times. McCain managed to emphasize the Obama theme as well, mentioning the word
"change" nine times.
- On the issues front, Obama's top word was "economy," mentioned 10 times (0.52%). McCain's main issue was "jobs," mentioned nine
times in the speech (0.49%), followed by "economy" (0.38%), mentioned seven times.
While this is a very high-level view of the most popular words in the speeches, the overall tone can
also be gauged from the longer list of terms that were not used as often.
Obama's less frequent, but impactful terms included the following: "College, veterans,
strength, businesses, protect, progress, nuclear, dreams, life, energy, hope, poverty, whiners, threats, power, troops, children, security, sister, believe, journey, debate" and
"patriotism." McCain's less frequent but impactful words included: "Trust, history, power, tough, peace, loved, business, attacked, blessed, opportunity, education, future, taxes,
experience, family" and "prosperity."
So here are the words of two men who want to be the U.S. president. If you have any thoughts on this analysis, or the speeches
themselves, I would be interested to read your ideas in the comment section of this post at MediaPost.