From: Stig Agermose <firstname.lastname@example.org>
Date: Tue, 02 Feb 1999 03:39:24 GMT
Fwd Date: Tue, 02 Feb 1999 12:59:23 -0500
Subject: Open Letter: MJ-12 & Dr. Wood

Source: "alt.ufo.reports".

Stig

***

From: "Sid Fiber" <dataVoid@hotmail.com>
Newsgroups: alt.paranet.ufo,alt.ufo.reports
Subject: Open Letter: MJ12 & Dr. Wood
Date: Mon, 1 Feb 1999 16:49:58 -0800

(Background: This is an open letter to Dr. Bob Wood and his son Ryan Wood, who are attempting to authenticate the MJ-12 documents. These documents allegedly record the US government's efforts to conceal the discovery of extraterrestrial beings and craft during the 1940s.)

Dear Dr. Wood--

If Truman, Bush, Einstein, and others authored the Majestic 12 documents, then the methods of forensic linguistics will conclusively prove it.

Several years ago, Dr. Donald Foster of Vassar College devised a brilliant statistical method for determining the authorship of a document, which he has since applied in dozens of court cases requiring verification of authorship. In 1996, Foster successfully pegged Joe Klein as the anonymous author of Primary Colors, the tell-all White House novel. Also, at the request of the FBI, Dr. Foster conclusively verified that Unabomber suspect Ted Kaczynski had indeed authored his invective manifesto.

Foster first gained prominence by successfully applying his technique to solve an age-old mystery surrounding the acting roles of Shakespeare. His breakthrough--hailed by Elizabethan scholars and computer scientists alike--was to assume that even Shakespeare was prone to linguistic habits which revealed themselves through statistical comparison.

Clearly, Dr. Wood, if you were to apply Dr. Foster's technique to the MJ-12 documents, you would be a giant step closer to conclusively verifying the authorship of these intriguing documents. If you're interested, the following article contains further details about Dr. Foster's work.
You can contact him at email@example.com or call Vassar at extension x5634.

Thank you and good luck.

--Sid Fiber

From the book "Interface Culture: How New Technology Transforms the Way We Create and Communicate" by Steven Johnson, pg. 155 - 260

Can a machine make sense of language without learning how to read? To answer that question, we need to venture back to one of the enduring mysteries of Shakespearean scholarship, a mystery that was solved by the statistical analysis of digital technology.

Of all the arcane puzzlements of Elizabethan literary criticism, few have been as tantalizing, and elusive, as the details of Shakespeare's acting career. For generations, scholars have known conclusively that the Bard performed in every play that he wrote; in two plays, in fact, we know which parts he played: the ghost in Hamlet and Adam in As You Like It. His other roles, however, remain a mystery. Because records of the Globe Theater's production schedule have survived through the ages, we know the run dates of each play in the oeuvre. We also have a reasonably accurate chronology for his writing career, which means that we can gauge with some precision the overlap between performance and composition. In other words, we know that Shakespeare was writing King Lear while he was acting in Othello, and that he was acting in The Merchant of Venice while he was writing Henry IV--we just don't know the parts he was playing at the time.

Or at least we didn't know until Don Foster stumbled across a brilliant, and strangely reassuring, idea. If Shakespeare had indeed memorized the lines for a part in one play while composing the script for another, then perhaps there had been a little seepage between the two. Perhaps the ritual of performing every night had lodged certain words in Shakespeare's head, like the detergent jingle from morning TV that hounds you through the workday. Anyone who writes for a living will recognize this phenomenon immediately.
Words cycle through our daily vocabulary at different rhythms. Certain words stick with us for life, and remain immediately accessible to us at any moment: the names of loved ones, the building-block grammar of our native tongue, the primary colors and cardinal numbers, and so on. Other words wax and wane, in sync with forces larger than the individual speaking them: the fashionable vagaries of slang, the geek-speak of technological innovation, the "ethnic" idiom derived from broader demographic trends. (Think of the influence of black English on the mainstream American dialect over the past twenty years.) Most words, however, lie somewhere in between: drifting in and out of our regular vocabulary, like a band of itinerants cursed with a hankering to settle down. The word profound strays into your head and sits there for weeks, at the very edge of consciousness, primed for use. And for weeks, whenever a situation arises that demands a tone of seriousness or intensity or ironic overstatement, the word profound rolls out like clockwork. But soon enough another contender implants itself (major, let's say, or crucial), and profound retreats to the darkened wings of occasional use. Foster's breakthrough was to assume that even the great Shakespeare might be prone to the same linguistic habits. Was it possible, Foster asked, that words from Shakespeare's memorized lines were accentuated in the Bard's vocabulary during the run of each play? Could the language of Shakespeare's acting career have infected his play writing? It took a computer to answer the question, a computer specially programmed to track Shakespeare's use of statistically meaningful words, words that he used fewer than ten times in his career. The computer analyzed the distribution of these words on two levels: first, their appearances in individual parts (Hamlet's ghost, say, or Midsummer's Lysander); and second, their appearances in entire plays. 
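
The two-level count described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Foster's actual program, which the article does not reproduce: the function names, the tokenizer, and the toy texts in the usage note are all hypothetical, and the "fewer than ten times" cutoff is exposed as the max_count parameter.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic words (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def rare_words(corpus_texts, max_count=10):
    """Return the words that occur fewer than max_count times across all texts."""
    counts = Counter()
    for text in corpus_texts:
        counts.update(tokenize(text))
    return {word for word, count in counts.items() if count < max_count}

def role_overlap(role_text, play_text, rare):
    """Count the distinct rare words that a single role shares with a whole play."""
    role_rare = set(tokenize(role_text)) & rare
    play_rare = set(tokenize(play_text)) & rare
    return len(role_rare & play_rare)

def rank_roles(roles, later_play_text, rare):
    """Rank the roles of an earlier play by rare-word overlap with a later play."""
    scores = {name: role_overlap(text, later_play_text, rare)
              for name, text in roles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

To use it, build the rare-word set from every text in the corpus, then call rank_roles with a dictionary mapping role names to their lines and the full text of a later play. The role at the top of the ranking is the one whose unusual vocabulary resurfaces most often in the later play. A real study would also need to normalize for the length of each role, which this sketch leaves out.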
If Shakespeare the actor was influencing Shakespeare the playwright, then certain plays would be littered with the vocabulary from a part in another, earlier play. The scattering of high-information words would be so light and varied as to be unnoticeable to humans, but the computer's prodigious pattern-recognition skills would track it down in a matter of hours--assuming, that is, that Foster's hunch was correct.

The results that came back from the lab turned out to be as precise and clearly defined as a fingerprint. Each play possessed a mirror-role in another play, revealed by the shared idiom of high-information words, like a family of orphans reunited by the science of DNA testing. In each instance the overlap followed the chronology of performance and composition. The results actually exceeded Foster's expectations. The analysis, as Edward Dolnick reported in The Atlantic, could be confirmed from a number of different angles: "It never assigns to Shakespeare a role we know another actor took. The roles it does label as Shakespeare's all seem plausible--male characters rather than women or children. The test never runs in the wrong direction, with the unusual words scattered randomly in an early play and clustered in one role in a later play. On those occasions when Foster's test indicates that Shakespeare played two roles in a given play--Gaunt and a gardener in Richard II, for example--the characters are never onstage together."

Using only the limited tools of word counting, the computer had solved a mystery that had eluded sentient, English-speaking scholars for centuries. The sterile number crunching powers of the PC could now tackle more rarefied, nuanced problems, problems that had as much to do with the meaning of language as with its statistical base.

Once again, we see evidence that technology rarely advances along a steady curve; instead, progress happens in a nonlinear, staggered fashion, with steady, incremental growth punctuated by sudden leaps forward. Take a bowl of water and gradually lower the ambient temperature in the room; for a stretch of time, the change is linear: the water gets colder as the temperature drops. But at a certain point a threshold is crossed, in this case the threshold of zero degrees Celsius, and suddenly you have not colder water but ice, a new property, fundamentally different from the preceding one. A slower machine, equipped to handle less textual information, is nothing more than a literary bean counter--good for generating a concordance for a single document, but not terribly sophisticated otherwise. But ramp up the processing power significantly, far enough to do a comparative study of word use in hundreds of documents, not just one, and you hit a threshold point, a singularity. The number cruncher becomes a literary sleuth, outsmarting tenured professors and armchair Shakespeare buffs.

In his Atlantic article, Dolnick speculates that Foster's software is a "sign of things to come" for literary studies. But the promise of literary computing extends well beyond the obscure details of Elizabethan drama. By the end of the decade, most personal computers will sport a version of Foster's program as a basic tool in the human interface, as essential to the user experience as windows and icons are today.

Perhaps the most startling thing about the Foster study is the simplicity of the program he used. The statistical properties of language, after all, are not limited to word frequency. There are, in fact, hundreds of attributes that the computer can use to build a numerical model of a given text. Which properties you decide to track depends on what you're looking for. Let's say you're trying to gauge the relative complexity of a document. You could have the computer monitor the length of each sentence; you could measure syntactical intricacy by tracking the number of clauses separated off by commas, em dashes, colons, and semicolons. Simply calculating the average letters-per-word would probably be enough to differentiate, say, The Cat in the Hat from Minima Moralia. A combination of all three might be sufficient to generate a useful complexity ranking for text documents.
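
The three measures named above (sentence length, clause separators, and letters per word) can be computed with a short Python sketch. This is an illustration only: the function name, the regular expressions, and the choice to treat "--" as a dash are assumptions, and the excerpt does not say how the three numbers should be combined into a single ranking, so the sketch simply reports them side by side.

```python
import re

def complexity_metrics(text):
    """Return three rough complexity measures for a piece of English text."""
    # Sentences: split on runs of ., ! or ? and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words: runs of letters only, so "don't" counts as two short words.
    words = re.findall(r"[A-Za-z]+", text)
    # Clause separators: commas, semicolons, colons, and dashes.
    separators = re.findall(r"[,;:\u2014]|--", text)

    n_sentences = max(len(sentences), 1)  # avoid dividing by zero on empty input
    n_words = max(len(words), 1)
    return {
        "words_per_sentence": len(words) / n_sentences,
        "separators_per_sentence": len(separators) / n_sentences,
        "letters_per_word": sum(len(w) for w in words) / n_words,
    }
```

Running complexity_metrics on two documents and comparing the resulting dictionaries gives a crude ranking: a children's book should score lower on all three measures than a dense philosophical text.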
UFO UpDates - Toronto - Operated by Errol Bruce-Knapp