Tuesday, December 04, 2012

Applied Anarchy

Jean Julien
A lot of people have been wondering aloud what prompted last week's diatribe against the inanity of English spelling; others found it accurate and refreshing. I suppose I should come clean about what motivated me to write it. Along the way, I also want to spell out (pardon the pun) what it is I specifically think can be achieved.

I think it would be useful to provide a systematic alternative to English spelling, one that would get children reading, not at some specific level, but reading anything that's written, after a year or so of not very rigorous instruction. It would, ideally, be so simple that it could be taught in full to illiterate or functionally illiterate English-speaking adults in a single adult education course at a community center, or an online course. It would be great if it were found to be useful in teaching English as a Second Language, specifically to people who do not particularly need to gain command of written English, but do want to be able to speak correctly. For reasons that I hope are obvious, it would not serve as a replacement for English spelling. It would not be used for law, business, engineering, medicine or science. It is intended as an alternative, simpler way of reading and writing English. It is a way of providing English with a human-readable soundtrack. It would be helpful if it had a visual appeal and a countercultural mystique, so that artists and designers, including tattoo and graffiti artists, added it to their graphical repertoire. It would be lovely if it went viral on the internet.

The idea came to me as I was thinking of a good demonstration project for an exercise in practical anarchy, where a few individuals acting autonomously can make a big difference and provide an alternative to a vast, entrenched, dominant, horribly flawed system. First, it is a problem begging for a solution: functional illiteracy rates in English-speaking countries are the shame of the developed world, and a lot of that has to do with the fact that English orthography, frozen in the mid-17th century, does not reflect how the language now sounds, has numerous patterns and even more numerous exceptions, and takes a ridiculously long time to learn. Second, English is the lingua franca throughout the world, and foreigners who learn English generally have no interest in the etymology or history of the English language; to them the spelling system is simply an obstacle.

On the other hand, English is a fairly simple language that could easily be written in a way that follows its phonological form. I believe I can solve it because I happen to be a trained linguist, and although I haven't delved too deeply into English phonology (until now) I know the principles. I also happen to be a software engineer, and, as it happens, the task of making this project work is 1% linguistic analysis and 99% software engineering. I think that it is realistic to make the 40,000 or so books available through Project Gutenberg also available in this new form by piping them all through a piece of software, which is yet to be written. The actual conversion process should only take a few minutes and can probably be done on demand. I think that it should also be possible to provide a browser plug-in that will convert English text on the fly. A somewhat bigger challenge would be to create a smartphone app that would allow users to photograph pages of English text and render them in a phonological alphabet.

This is an idea that is made for just this moment in time, when most text is digital, or available in digital form via the internet. In previous ages, the process of converting entire libraries of books would have been so labor-intensive as to be unthinkable. Each book would have to have been converted by hand, proofread, and then printed and distributed. If a new alphabet were to be used, this would require new typefaces to be cast and new typewriters to be made. To carry this project out would have required a huge clerical staff that would have had to be specially trained for this task, which, once it were completed, would have made their skills instantly obsolete. None of this would be possible without ample government funding. In short, it would have been a boondoggle writ large. Now, however, it is a matter of writing some software.

This is also an idea for a time to come. Access to information in digital form is only as reliable as the electric grid, which, recent experience has shown us, is not particularly reliable at all, with an exponentially increasing incidence of blackouts. Let us extrapolate some trends from the past decade to a time a decade or two into the future. Fossil fuels are still plentiful but so expensive that nobody would think of running a tractor, or a tractor-trailer, to bring food to the people, and so the people have to go where the food is growing. At the same time climate change is making large-scale industrial agriculture increasingly untenable, so that people have to grow their own food using labor-intensive pre-industrial methods. The educational system, which is currently producing high school students with 5th-grade reading skills, has long fallen by the wayside. But people will still want to be able to read to their children by the campfire at the end of a long day in the fields, and they will want to be able to teach their children to read. Will they want to teach them to read by spending years explaining to them hundreds of spelling patterns and making them memorize thousands of exceptional cases? Or would they rather have them learn a small set of symbols, teach them to use them to sound out syllables and to put them together into words, and then turn them loose on any book they can find?

* * *

The many responses that came in after I published last week's post showed that a great many people have no idea that there even is a problem, never mind what the problem might be. A constant refrain was “Works for me!”: I learned English spelling, and so should everyone else. Many more people wrote to tell me that spelling reform is politically impossible. One educationalist accused me of being against phonics, which is a way of teaching English reading by pointing out the various ways that letters can map to sounds, versus the “whole word” approach. Actually, I consider phonics to be the lesser of the two evils. One reader thought that English should be written using the Cyrillic alphabet. Too late, it already is. Just look at the storefronts in Moscow and St. Petersburg: they are crammed full of transliterated English, some of it barely recognizable as such. Another made the commonsense but not entirely workable suggestion that we use the International Phonetic Alphabet. IPA is the professional tool which linguists use to describe speech sounds, but it has never been used directly to create an orthography for any language that I know of, for many reasons, one of which is that it's really quite ugly and hard to decipher. Only one actually went as far as acknowledging that there is a problem, but several expressed incredulity that the problem even existed. Perhaps I should have provided more references. Well, better late than never, and so here is a short summary of the problem, from the English Spelling Society:

English grammar and punctuation are relatively easy. But English spelling is quite the reverse - probably the most irregular of all alphabetic systems. Not only can you not tell how to spell a word from hearing it spoken; you can’t even be sure how a word is spoken from the written word – a unique “double whammy”.
The reasons for this irregularity are complex and largely historical. But the economic and social costs are serious. English speaking children take on average three years longer to learn to read and write than others and some never succeed. Our dyslexics struggle in a way that Italian and Spanish children do not. Adult illiteracy remains stubbornly high (23%).

I think the 23% number is being too kind; the functional illiteracy rate is much higher. If you click the irregularities link above, it will take you to a sort of English spelling cathedral of shame which, if you read through the entire list and try to make sense of it all, will probably leave you shaken. Is it really that bad? (Yes, it is.) And does making our children learn it classify as child abuse? (You decide.) Lastly, is all of this artificial complexity even necessary? (No, definitely not.)

This level of complexity and irregularity imposes a large cognitive processing overhead on those trying to learn to read and write English. Here is a diagram from a paper by Ram Frost titled Orthography and Phonology: the Psychological Reality of Orthographic Depth. He looked at the difference in the process of deriving sound from graphical form between the “shallow” orthography of Serbo-Croatian, where each letter represents a phoneme, and the “deep” orthography of English, where no such one-to-one correspondence exists. Apparently, the mind of a person who is learning to read and write English is crammed full of such nonsense. By the time the learning process is complete, the reader starts looking up the phonological form of the whole word, as if it were a random hieroglyphic; thus, no matter what the teaching process is, in the end learning to read English involves rote memorization of the written form of each word. It is little wonder that so many people never complete the process. Is there a better way? Well, not at the moment, but, obviously, there ought to be.

* * *

Babies are born ready to learn a language (or two or three): it is part of their innate developmental program. They do not need to be taught to babble. From just a couple of months old they start to spontaneously produce consonants and vowels. They start with just a few consonants and with just about every vowel and diphthong imaginable. Over time, their consonant repertoire increases while their vowel repertoire shrinks down based on what they hear around them. They start with single syllables, and eventually learn to string them together into words and phrases. Eventually the two complementary systems involved in language perception and production—the perceptual and the articulatory—become dialed in to a specific language, with its specific inventory of phonemes and phonological rules.

Phonemes are not physical but psychological in nature. They are not something that can be picked up by a microphone or analyzed by shooting an x-ray film of a speaker's mouth. Those are allophones, which are speech sounds produced by feeding a sequence of phonemes through a set of phonological rules. Phonemes are at a higher level of abstraction, and evidence for them, as with all psychological phenomena, is indirect. However, the existence of a phonemic inventory for each language is perfectly uncontroversial. The set of phonological rules is learned automatically and unconsciously, along with all the other automatic processes of language acquisition. One set of rules determines which phonemes are mapped to which allophones under what conditions. Another set of rules, in English as well as many other languages, such as Russian and Portuguese, governs vowel reduction: unstressed vowels decay to something shorter and generally indistinct, often called a “schwa.” (Think of the difference between the sound of the first 'o' in “psychology”/“psychological” or the second 'o' in “photography”/“photographical.”)

Consider the following minor (very minor) miracle: speakers of different English dialects can learn to understand each other without being taught to do so in school, and, in fact, without much effort at all. This is true even for those of them who are entirely illiterate. This is because they all have substantially the same underlying, psychological representation of English in their heads, which they express differently, via different sets of phonological rules. These rules do not need to be taught but are learned spontaneously, simply by listening. Most people learn the perceptual portion of the rules, allowing them to understand other dialects. Some people learn the articulatory portion as well, allowing them to sound British or Scottish or Irish or Australian, or, my personal favorite stealth dialect, Canadian. Thus, what makes English one single language has little to nothing to do with the way it's written. It is one language because it has a common phonological representation in the minds of its speakers. This allows them to understand each other without any reference to the way the language happens to be written.

Having thought about this for a couple of months now, I have come up with a set of conjectures that make the task of creating a shallow English orthography much easier. Here they are:

1. There is a specific phonemic inventory that is largely invariant across all the major English dialects
2. English dialects vary mostly in their phonological rules; the underlying phonemic representations are substantially the same across English dialects
3. There is no phoneme corresponding to “schwa”: there are only vowel reduction rules which are learned spontaneously and automatically and do not need to be reflected in the orthography
4. Differences between English dialects that cannot be captured using a common set of phonemes are lexical differences that no orthographic representation can ever hope to bridge. Simply, certain words have to be written differently across certain dialects.
5. Thanks to Hollywood films (which make money by being shown without subtitles in all English-speaking countries) the best-understood English dialect throughout the world is General American, so that's the best one to serve as the basis for the alternative orthography. However, the phonological representation of GA can be relatively dialect-agnostic.

What is this common phonemic inventory? Here is the entire phonemic inventory for every dialect of English. It is an excellent tool for capturing the exact sounds of every dialect of English, but it is simply too large to serve as the basis for an orthography. I have found, however, that it can be pared down substantially and still capture the phonological representation of English that is valid across dialects. Here is what I think is a minimal set, which I derived by looking at the presence of minimal pairs across major dialects. (The phonemes are shown between slashes, the allophones in square brackets.) The vowels are the most troublesome, because there is potentially a very large inventory of them across dialects, but the distribution of minimal pairs shows which distinctions actually matter.

/ɪ/ , /i/ (shit/sheet) — rather important distinction
/ᴧ/, /ɑ/ (cut, father) can be expressed as one phoneme /a/ because there are no minimal pairs except, perhaps, come/calm and bum/balm, but since the 'l' is sometimes pronounced, why not just write it that way?
/æ/ (cat) — in RP (British “Received Pronunciation”) it is often pronounced [ɑ], causing ambiguity
/ɛ/ (bed)
/o/, /ɔ/ — can be taken to be two allophones of /o/ which sounds different depending on its context within a word
/ʊ/, /u/ (pull/pool)

Thus our minimal vowel inventory across all dialects is taken to consist of just these eight:
/ɪ/, /i/, /a/, /ɛ/, /æ/, /o/, /ʊ/, /u/
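
The vowel merges listed above can be sketched as a simple lookup table. This is only an illustration: the names `VOWEL_MERGES`, `MINIMAL_VOWELS` and `merge_vowels` are hypothetical, and a symbol-for-symbol merge is an assumption that ignores word-by-word decisions about which bucket a given vowel belongs in.

```python
# Hypothetical merge table: dialect-specific vowel symbols mapped onto the
# pared-down set of eight. Both the glyph used in this post and the standard
# IPA turned-v are accepted for the "cut" vowel.
VOWEL_MERGES = {
    "ᴧ": "a",  # cut (glyph as written in this post)
    "ʌ": "a",  # cut (standard IPA symbol)
    "ɑ": "a",  # father
    "ɔ": "o",  # treated as a contextual variant of /o/
}

# The eight vowels of the minimal inventory.
MINIMAL_VOWELS = ["ɪ", "i", "a", "ɛ", "æ", "o", "ʊ", "u"]


def merge_vowels(phonemes):
    """Replace merged vowels; leave every other symbol untouched."""
    return [VOWEL_MERGES.get(p, p) for p in phonemes]


print(merge_vowels(["k", "ʌ", "t"]))       # cut
print(merge_vowels(["f", "ɑ", "ð", "ɚ"]))  # father
```

Keeping the merges in a table rather than in code means the inventory can be revised as minimal pairs are checked against more dialects.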

Liquids and nasals: /l/, /n/, /r/, /m/

Syllabifying consonants:
/ḷ/, /ṇ/, /ɚ/ (bottle, button, butter) — these are consonants that act like vowels. Plenty of dictionaries insert a schwa in front of them, but, as I said, at the phonological level the schwa doesn't exist

To simplify things further, so-called “r-colored” vowels I take to be just regular vowels coarticulated with a following /r/, while diphthongs are taken to be just two coarticulated vowels: /oʊ/ = /o/+/ʊ/
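
A minimal sketch of this decomposition, assuming that diphthongs arrive as two-character strings and that r-colored vowels are marked with the IPA rhotic hook (U+02DE). The function name `decompose` and the input conventions are assumptions made for illustration, not part of any existing tool.

```python
RHOTIC_HOOK = "\u02de"  # IPA modifier that marks an r-colored vowel


def decompose(symbol):
    """Split a diphthong or r-colored vowel into a sequence of plain phonemes."""
    parts = []
    for ch in symbol:
        if ch == RHOTIC_HOOK:
            parts.append("r")  # r-coloring becomes a following /r/
        else:
            parts.append(ch)   # each vowel of a diphthong stands alone
    return parts


print(decompose("oʊ"))               # /oʊ/ = /o/ + /ʊ/
print(decompose("a" + RHOTIC_HOOK))  # r-colored /a/ = /a/ + /r/
```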

The rest: /j/, /s/, /z/, /w/, /ʧ/, /ʤ/, /t/, /d/, /h/, /ŋ/, /k/, /g/, /f/, /v/, /p/, /b/, /ʃ/, /ʒ/, /θ/, /ð/

This gives us just 35 phonemes that need to be represented using unique symbols, which is a perfectly reasonable size for an alphabet. But it can be pared down further. Observe that there are eight consonant pairs that differ in just one feature: one is unvoiced, the other is voiced: /s/-/z/, /ʧ/-/ʤ/, /t/-/d/, /k/-/g/, /f/-/v/, /p/-/b/, /ʃ/-/ʒ/, /θ/-/ð/. It also turns out that, of these, the voiced ones occur only half as frequently as the unvoiced ones in English speech. Therefore, there is no reason to waste an entire separate symbol on each of the eight voiced ones. We can represent them as unvoiced ones with a “voicing mark” such as the one used in the two Japanese syllabaries: /g/ = /kv/, etc. This gets us down to just 27 symbols, one more than the Latin alphabet.
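
The arithmetic and the voicing-mark idea can be checked with a short sketch. The inventory lists are copied from the groups above; the names `VOICED_TO_UNVOICED`, `VOICING_MARK` and `encode` are hypothetical, and the letter "v" stands in for the mark only because the example /g/ = /kv/ writes it that way.

```python
VOWELS = ["ɪ", "i", "a", "ɛ", "æ", "o", "ʊ", "u"]
LIQUIDS = ["l", "n", "r", "m"]
SYLLABIC = ["ḷ", "ṇ", "ɚ"]
REST = ["j", "s", "z", "w", "ʧ", "ʤ", "t", "d", "h", "ŋ",
        "k", "g", "f", "v", "p", "b", "ʃ", "ʒ", "θ", "ð"]

# Each voiced consonant paired with its unvoiced counterpart.
VOICED_TO_UNVOICED = {
    "z": "s", "ʤ": "ʧ", "d": "t", "g": "k",
    "v": "f", "b": "p", "ʒ": "ʃ", "ð": "θ",
}

VOICING_MARK = "v"  # placeholder for whatever mark is eventually designed


def encode(phoneme):
    """Write a voiced consonant as its unvoiced partner plus the voicing mark."""
    base = VOICED_TO_UNVOICED.get(phoneme)
    return base + VOICING_MARK if base is not None else phoneme


all_phonemes = VOWELS + LIQUIDS + SYLLABIC + REST
base_symbols = [p for p in all_phonemes if p not in VOICED_TO_UNVOICED]

print(len(all_phonemes))  # 8 + 4 + 3 + 20
print(len(base_symbols))  # after dropping the eight voiced consonants
print(encode("g"), encode("k"))
```

The count of 27 covers the letter symbols only; in this sketch the voicing mark itself is treated as a diacritic rather than as a separate letter.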

However, the Latin alphabet happens to be the wrong choice. Yes, it contains 26 different letters, but they are not the ones we need. It is possible to borrow diacritical characters from other languages, but the result will look foreign. (You may think that foreign looks cool, but I think that extraterrestrial looks even cooler.) Also, any attempt to recycle the Latin alphabet would result in something that looks like English horribly mangled and misspelled. For all its faults, written English does have a certain consistent aesthetic, which the alternative would lack. It would start out as a graceless hack, and would be instantly despised. It is better to start with something that is, at the outset, completely illegible, but where a few hours of effort later the sounds of words start to spontaneously pop right into one's mind with no additional processing required.

To wit:
For ol ıts folts, rıtṇ Iṅglış daz hæv a sṛtn konsıstent esþetık, wıc ðe æltṛnatıv wud læk. It wud start aut æz a greisles hæk, ænd wud bi ınstantli despaizd. It ız betṛ tu start wıð samþıṅ thæt ız, æt ðe autset, komplitlı ılejibl, bat weṛ a fyu auṛz ov efṛt leitṛ ðe saundz ov wṛdz start tu sponteıniaslı pop rait ıntu wanz maind wıð nou ædışṇal prosesıṅ rekwaıṛd.
To illustrate my point, I spent a few minutes coming up with an IPA-to-Latin mapping that wouldn't look too ugly, borrowing a few letters from Old English/Icelandic, a couple more from Turkish, and a few more from IPA, but the result is still startlingly ugly. There is a strong interference effect, which no amount of fiddling with the mapping will ever eliminate. The symbols have to be fresh ones, with no preexisting associations of any kind, so that people who see them for the first time can pass no judgment on them. By the time they figure them out, they will have breathed the air of freedom and realized what they have been missing, and the change in them will be irreversible. So far, people have proposed using IPA, Extended Latin, SAMPA, Cyrillic, Greek, Shavian and Deseret. None of these will work.

* * *

The experiment, then, is as follows:

1. Compile a phonemic dictionary of English from various text-to-speech dictionaries by running vowel reduction rules backwards

2. Invent a set of symbols to represent all the phonemes

3. Compile a corpus of English literature and a browsing tool that uses the alternative orthography, plus some learning tools

4. Wait for the epiphany: “OMG I can read this, and it's written exactly how it sounds! Wow!”
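
Steps 1 through 3 can be sketched in a few lines, assuming a phonemic dictionary already exists. Everything here is a stand-in: the three dictionary entries are hypothetical examples, `SYMBOLS` simply maps each phoneme to itself until real symbols are designed, and words missing from the dictionary are passed through unchanged.

```python
import re

# Step 1 (stand-in): a tiny hypothetical phonemic dictionary.
PHONEMIC_DICT = {
    "cat": ["k", "æ", "t"],
    "bed": ["b", "ɛ", "d"],
    "pool": ["p", "u", "l"],
}

# Step 2 (stand-in): each phoneme represents itself for now.
SYMBOLS = {p: p for phonemes in PHONEMIC_DICT.values() for p in phonemes}


def transcribe_word(word):
    """Look a word up in the dictionary; leave unknown words as they are."""
    phonemes = PHONEMIC_DICT.get(word.lower())
    if phonemes is None:
        return word
    return "".join(SYMBOLS.get(p, p) for p in phonemes)


def transcribe(text):
    """Step 3 (stand-in): convert running text word by word."""
    return re.sub(r"[A-Za-z']+", lambda m: transcribe_word(m.group(0)), text)


print(transcribe("The cat sat on the bed."))
```

Because the conversion is a per-word lookup, the same function could sit behind a batch converter for a whole corpus or behind an on-demand browsing tool.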

And after that, who knows what will happen. And that, I think, is the beauty of practical anarchy.

60 comments:

Slavito said...

What about using the Greek alphabet as the basis and adding a few IPA characters (œ, ø, etc) as needed?

Dmitry Orlov said...

That would make it look foreign. I believe that's an instant disqualifier. It can look extraterrestrial, but it can't look like a non-English Earth language.

Anonymous said...

What a great project idea! Do you have a support system in place, while you develop it, or, do you need assistance?

Dave said...

Just for interest's sake, a few Mormons tried this as they were settling the Salt Lake Valley. It would seem that most rejected the idea. Following is a link with some explanation as well as printed examples including an English equivalency chart.

http://www.utlm.org/onlineresources/deseretalphabet.htm

Nathan said...

This would make a great Kickstarter project. I am assuming it would be nice to have a little money to devote the time and energy to getting this off the ground into an alpha or beta test. I would be willing to donate time and/or some money to see this happen. But I don't know anything about linguistics or orthography design - I would be a poor choice for anything but doing programming behind the scenes.

Wiglaf said...

Tolkien's Tengwar (Elvish script) and Angerthas (Runes that are reminiscent of Anglo-Saxon runes, but aren't) are attractive to me because the relationships between the shapes of the letters are, as far as is feasible, isomorphic to the relationships between phonemes. The idea is that once you know either system, it is fairly easy to transliterate any language (as long as the phoneme set isn't too exotic). I devised a fair number of English variants in my teen years.

Stanislav Datskovskiy said...

Consider the http://en.wikipedia.org/wiki/N%27Ko_alphabet#Unicode. I found it by leafing through the Unicode charts, and noticing that most of the symbols I had in mind were already in that one place.

Dmitry Orlov said...

etresoi -

I don't think it will take much capital to get it off the ground, but I wouldn't mind a collaborator or two.

Dave -

I bet the Mormon thing would have gone further if they had a computer program to convert the entire English corpus to their system.

Shadowfax said...

In my job as a marine electrician I have met a scary number of functionally illiterate adult males.
Some are successful businessmen.
They tell me they "hate reading manuals."
They really mean they can't read them.
I can't imagine a world where I only get information from the TV. What a narrow world view they have!

Unknown said...

Dmitry, I think you would like:

http://ententetranslator.com/btrspl.html

Robert Alan Mole, at his Entente web site, has a lot to say about language and English spelling reform, and has gathered a lot of information about it. From his website, you can download a program (BTRSPL) to convert documents in regular English spelling to a simplified spelling. He also has many references to other spelling reform sites and to books already translated into a simplified spelling.

Anyway, I enjoy your comments about this long neglected issue.

Aldabra said...

How is functional illiteracy in Chinese?

Anonymous said...

Based on your criteria I'd say the project (including the software and typeface) could fit in the Technology, Design and/or Publishing categories on Kickstarter. The final product could be a website, and each contributor could receive a printed copy of the book of their choice in the new alphabet.

Rien said...

I am not very familiar with Esperanto, but how would this compare to Esperanto, and how could it avoid the same fate?

Lunchista said...

Chairman Mao had all the most-used Chinese characters "simplified" at about the time of the revolution (1949) for precisely that reason, so that they could start a massive literacy drive. Recognising about 2,000 of them enables you to read a newspaper.

Some of the characters contain "clues" as to how they're pronounced, but they are even less reliable than English.

Sorry I don't know what today's literacy rates are, mind you.

BillSeitz said...

Wikipedia has some nice coverage of past designs for reformed alphabets and/or spelling rules. http://en.wikipedia.org/wiki/English_language_spelling_reform

Anonymous said...

I have to disagree with creating a new set of symbols for this project. As bonkers as English is, the letters themselves have associated sounds, which anyone familiar with any Latin alphabet-using language already knows. I think potential users of your system would be more intimidated by, and less likely to take upon learning, 45 new symbols than by a "foreign" alphabet of letters they recognize plus diacritics and unfamiliar symbols. Using existing common character sets would also allow people who haven't downloaded your programs to adopt your new orthography, which seems more in keeping with anarchic principles.

Jean-Paul Printemps said...

Graffiti I've seen shows that the alphabet can be made to look extraterrestrial. The alphabet has an ethical basis to its order and pictographic scheme. This you can read about in one of Robert Graves's books.

Sumerian (or Phoenician) ethics may not be relevant to the coming age. Still I think some circumspection is useful when dealing with "the devil you don't know."

HierósKórax said...

Jean-Paul Printemps -

Don't trust anything Robert Graves said. His imagination and erudition were outstanding, but no expert takes seriously his theories about etymologies, etc. The "White Goddess" is a marvelous book, but is practically science-fiction (or linguistics-fantasy to be more precise).

Moreover, what Dmitry proposes about breaking with current word-history is not a problem, it's a feature. It's all about paring down the system from the quirks that result from transmission through centuries.

However, I think this has already been done. The shavian alphabet may meet the requirements with a couple of modifications.

JimK said...

Our crazy spelling lets us distinguish words like "I" vs. "eye". An improved spelling system still needs some mechanism. How about some kind of parenthetical disambiguator, some purely graphical suffix. I think that Chinese characters do this sometimes. It could be another word, like ai(person) vs. ai(organ). Except that's too long. Maybe just an abbreviation ai/p vs. ai/o.

Sven said...

Both deseret and shavian seem to meet your criteria of looking interesting

Dmitry Orlov said...

JimK -

A good comment. English has lots of ways to spell the same word, e.g. you, yew and ewe are all the same spoken word, /jʊ/. But it also spells different words the same way, e.g., read/read, lead/lead, live/live, etc. The mind disambiguates homonyms automatically and unconsciously without such orthographic crutches, but it stumbles when different words are spelled the same. They are an unnecessary complication. Nobody ever complains about bat, bat and bat being spelled the same. A text that relies on orthographic crutches to be understood cannot be read aloud without losing the audience. In short, the spoken word is the actual living thing, and the way it is written down is just a device. In the case of English it's a broken device, hopelessly out of date and bunged up with random garbage.

Lance M. Foster said...

"/ᴧ/, /ɑ/ (cut, father) can be expressed as one phoneme /a/ because there are no minimal pairs except, perhaps, come/calm and bum/balm, but since the 'l' is sometimes pronounced, why not just write it that way"

Butt/but /bᴧt/ and Bought /bɑt/, is another minimal pair in General American (my native dialect) and these have no l.

I will throw my hat in the ring to help if you like. I have basic linguistics, speak General American as my native dialect, and also am an artist.

Dmitry Orlov said...

Butt/but /bᴧt/ and Bought /bɑt/ — I would put the first in the /a/ bucket and the second in the /o/ bucket, and let phonological rules handle the rest. Contact me via email if you want to help.

Raymond Duckling said...

Jelow Dimitri.

Ai tink yu ar way bejaind in terms of Inglish riformeishion. Mexican estudents jab develop't a fonetic system for Inglish as a second lenguash, wich is brokenly inflict't upon oz as part of ofishial edukeishon programs (Ai did 6 yirs of Inglish in yunior-jai and jai-skul).

Ai aknowlesh many sobt'l diteils ar lost, sinz fonems betwin de tw languashes ar not alweis de seim. But Ai tink its posibol to translitereit with a latin caracter alfabet wanz yu now de wei douz ar uter'd in Espanish.

=========================
Hello Dmitri.

I think you are way behind in terms of English reformation. Mexican students have developed a phonetic system for English as a Second Language, which is brokenly inflicted upon us as part of official education programs (I did 6 years of English in junior-high and high-school).

I acknowledge many subtle details are lost, since phonemes between the two languages are not always the same. But I think it's possible to transliterate with a Latin character alphabet once you know the way those are uttered in Spanish.

JimK said...

Yeah surely the spoken language is primary. But even in spoken language people will occasionally resort to spelling aloud to disambiguate words that sound the same. This happens in other languages too. Maybe if context isn't sufficient then folks should just use a phrase of some sort. But spelling is often the quickest easiest way.

Lennon C. Tucker said...

Do it. From a collapse perspective, it would be very helpful if learning to read took much less time. We might not have the time or the funds to allow our children to waste them both sitting in ineffective schools learning a system that is nearly as difficult as kanji, but lacking the beauty that allows kanji to remain at least useful as an art.

Anonymous said...

it alrdy xists. its cld txtng. and its BAD 4 gd thnkng!
---
I enjoyed last week's post, but it's hard for me to decide whether orthography or grammar is more important to learning a language easily. As a native Russian speaker, you may underestimate the complexity of learning six cases, verbal aspect, verbs of motion with all sorts of exceptions, etc.

George Bernard Shaw left a chunk of his fortune to the standardization of English spelling. Not a lot came of it.

beetleswamp said...

I've been waiting for someone to get this over with for a long time. I majored in English and am still quite hopeless without spellcheck. When I moved to Hawaii and took a Pacific Lit course with Albert Wendt he made it very clear that the fastest way to colonize a people is to destroy their language. Pretty much the whole course we were studying people's different attempts of reverse colonialism. This seems like a huge step in the right direction to give the language back to the people.

Kristiina said...

Hehee, absolutely thrilling project! I guess linguistics and programming would be the skills that will be needed for this, and I have my skills in other directions, but I'd love to contribute somehow.

I am still thinking about the character of languages, and how it influences the human character. As we humans have such flexible minds, the brain develops around this tool of language. And then a writing-system develops on top of this. It seems originally using signs/symbols was considered a potent magic. Invoking something into people's minds that is not actually/physically present. A tool for magic. Now that magic has mostly turned into black magic: nasty little programs/autonomous complexes/demons/advertising and propaganda/ (choose your favored term to fit your world-view). Stuff that bogs down the fine tool of a functioning brain into a broken record that keeps repeating some idiotic fragments. Rubbish that will ruin a fine instrument. So, this idea is so incredibly fine in that it starts to clear out the garbage. Exactly what we need to do, on every dimension of existence. Sort out what is necessary and useful, and respectfully put aside what cannot be used. Or maybe invent some new uses for it.

Many have suggested some already-existing writing systems/alphabets, with modifications. I like it how you want to make something entirely new. A creative act, not just re-arranging something that already exists. Wishing you well, wishing all success for your project!

Justin Patrick Moore said...

Ever read the novel Riddley Walker by Russell Hoban? It's an excellent post collapse tale set in England, and the whole thing is spelled out phonetically (in latin letters of course), but it is a great read. Worth checking out for the story as much as the textual experiment.

I'm enjoying this series of posts on applied anarchy. Thanks!

cmaukonen said...

I love the idea.

HA Written English. The reason why spell checkers will give you the correct spelling of the wrong word half the time.

But I digress.

There is only one language I can actually read - that is, pronounce from what is written.

Yiddish.

I can read and pronounce Yiddish correctly nearly all the time.

Current written English I feel is the language of a putz.

David H said...

Speaking as an artist, I'm wondering if these new characters could be developed so that they visually have some relation to the phonic sounds. For example, vowels let the breath flow while consonants impede the flow of the breath. Really basically, if vowel symbols had flowing lines while consonant symbols had straight/intersecting lines, the visual look of the words might help reflect the sounds. Wouldn't it be interesting if beautiful-sounding words also ended up reflecting visual beauty in the written word?

nulinegvgv said...

I have a four year old who cannot yet read (but appears nearly ready) and a 6 year old who is reading but still new at it. Between the three of us we could provide a broad test range.

Lacy Thompson Jr said...

Here's a link to my personal stab at the problem that dates back about 10 years. http://lacysinter.net/Projects/Phonetic%20Reader-Writer/Phonetics%20Developing%20Ideas.htm
I am a slow reader and wanted something that might help. We REALLY need to start with a BLANK sheet of paper and reinvent Reading and Writing to be closer analogues to speech.
In parallel with retooling reading, the keyboard must be retooled as a phonetic-based instrument. Five fingers give you 2 to the 5th, or 32, states that could be expressed through a chording input device, which could be buttons around the periphery of your smartphone. The thumb could actually activate 3 states (push, down, or up), giving 2 to the 7th, or 128, states. (Subtract one, since one of the combinations is no input.)
Imagine reading and writing emails by chording on your smartphone! It could be done inconspicuously. Reading would be a tactile feedback from the buttons.
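The chord arithmetic above can be checked by enumerating the combinations. The sketch below is only an illustration: `count_chords` is a hypothetical helper, and the two thumb models (three independent thumb switches versus a single thumb switch with three mutually exclusive positions) are assumptions about how such a device might be wired, not part of the proposal itself.

```python
from itertools import product

def count_chords(switch_states):
    """Count distinct chords for a set of switches.

    switch_states lists, for each switch, how many states it has
    including "idle". The all-idle combination is excluded because
    it represents no input at all.
    """
    combos = product(*(range(n) for n in switch_states))
    return sum(1 for _ in combos) - 1

# Five simple on/off buttons: 2**5 combinations, minus the idle one.
five_buttons = count_chords([2] * 5)

# Four finger buttons plus three independent thumb switches: 2**7, minus one.
seven_switches = count_chords([2] * 7)

# Assumption: the thumb can only be in ONE of push/down/up (or idle) at a time.
exclusive_thumb = count_chords([2] * 4 + [4])

print(five_buttons, seven_switches, exclusive_thumb)
```

Under the mutually exclusive reading of the thumb, the count is smaller than 2 to the 7th, so the exact figure depends on the hardware design.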
The really neat thing about going over to a phonetically based system is that dialects and emphasis can be registered by the strength of the particular phoneme. Strength could be visually represented by the point size of the symbol or the strength of the tactile feedback.
There is a phonetic dictionary database at Carnegie-Mellon http://www.speech.cs.cmu.edu/cgi-bin/cmudict also http://www.festvox.org/cmu_arctic/
The current state of the art for reading and writing is so poor that I envision better tools being so superior that the old systems could not compete for reading or writing or comprehension. Imagine a trainer on your smartphone where you start out with "See the cat run" and you get the audio, the visual symbology and the tactile feedback simultaneously. I predict that within a week you would be so much more proficient at reading, writing and comprehension with the new system that it wouldn't even be funny!
If anyone wants to collaborate on this, I'm game. My background is as an entrepreneur of high tech professional audio equipment. www.LTSound.com
Lacy

Ragtag Bunch said...

Two thoughts to consider in the formation of a new English orthography:

1) Examples of existing constructed orthographies for English: http://www.omniglot.com/writing/conscripts.htm#english
Most of these are "experimental" and impractical, but do provide insight.

2) Lexicalisation. A benefit of lower-case script is that letters with ascenders (b d f h i j k l t) and descenders (g j p q y) increase legibility. When a reader becomes sufficiently comfortable with a script it becomes easy to recognize word-forms lexically by their shape.
It has also been suggested that the syllabic blocks of Korean Hangul increase legibility by representing syllables as easily repeated textual morphemes. In this way familiar syllables are recalled from the lexicon, while unfamiliar syllables can be quickly deduced from the individual letters.
Some Hangul fonts utilize both of these aspects.
eg. UnJamoDotum (http://www.wazu.jp/gallery/views/View_UnJamoDotum.html)

Anonymous said...

Extraterrestrial, hmm? I offer that there are 40 box drawing characters in code page 437 (0xB3-0xDA, iirc). Among the 40 are 7 glyphs with full-length double horizontal lines, which along with their single-horizontal-line mates could be allocated to the voiced/unvoiced pairs, with the eighth pair represented by, for instance, the single upper-left corner with single or double horizontal lines 0xDA and 0xD5.
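The byte range mentioned above can be inspected with Python's built-in `cp437` codec. This is a minimal sketch: the variable names are illustrative, and the list of seven bytes with a full-length double horizontal line was picked by eye from the code page table, so treat it as one reading of the suggestion and not a definitive allocation.

```python
# Decode bytes 0xB3..0xDA using Python's built-in code page 437 codec.
box_bytes = bytes(range(0xB3, 0xDA + 1))
box_chars = box_bytes.decode("cp437")

# Map each byte value to its Unicode glyph for easy lookup.
glyph_for = dict(zip(box_bytes, box_chars))

# Assumption: these are the seven glyphs whose double horizontal
# line spans the full width of the character cell.
double_horizontal = [0xCA, 0xCB, 0xCD, 0xCE, 0xCF, 0xD1, 0xD8]

print(len(box_chars))                        # size of the range
print(box_chars)                             # every glyph in the range
print("".join(glyph_for[b] for b in double_horizontal))
print(glyph_for[0xDA], glyph_for[0xD5])      # the proposed eighth pair
```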

It could even make for a futuristic looking cursive that would resemble an EKG with a few extra crosses and dots, and appear sufficiently distinct from Arabic, Thai or Devanagari writing to pass the novelty standard. Plus, being a simple cursive script, it would be fairly easy to write with no technology, making the system a full-duplex, read-write orthography which could easily displace modern English orthography in pedestrian daily usage, just as modern English partially displaced Latin as the language of the learned (lernid or lernd? ;) ) over the past few hundreds of years.

As to technology, there exists a Perl module Lingua::EN::Alphabet::Shaw or something like that which translates/transliterates English text into Shavian, guided by the CMUDICT pronunciation dictionary. It might make a worthwhile basis for a server-side or batch translator a la the many Internet-meme dialectical humor translators for pirate, lolcat, ermagerd, etc.
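A dictionary-guided transliterator of the kind described above can be sketched in a few lines. This is an illustration in Python, not the Perl module's actual interface: the four pronunciations are a tiny hand-copied sample in CMUdict's ARPAbet style, the ARPAbet-to-IPA table is deliberately partial, the output is IPA and not Shavian, and `transliterate` is a hypothetical name.

```python
import re

# Tiny sample of ARPAbet pronunciations in the style of CMUdict.
# A real translator would load the full dictionary file instead.
SAMPLE_PRONUNCIATIONS = {
    "see": "S IY1",
    "the": "DH AH0",
    "cat": "K AE1 T",
    "run": "R AH1 N",
}

# Partial ARPAbet-to-IPA table, covering only the phonemes used above.
# Entries with a stress digit take priority over the bare phoneme.
ARPABET_TO_IPA = {
    "S": "s", "IY": "i", "DH": "ð", "AH0": "ə", "AH": "ʌ",
    "K": "k", "AE": "æ", "T": "t", "R": "ɹ", "N": "n",
}

def phone_to_ipa(phone):
    """Look up a phoneme, first with its stress digit, then without."""
    return ARPABET_TO_IPA.get(phone) or ARPABET_TO_IPA[phone.rstrip("012")]

def transliterate(text):
    """Replace each known word with its IPA rendering.

    Unknown words are passed through inside angle brackets so that
    gaps in the dictionary stay visible.
    """
    out = []
    for word in re.findall(r"[a-z']+", text.lower()):
        phones = SAMPLE_PRONUNCIATIONS.get(word)
        if phones is None:
            out.append(f"<{word}>")
        else:
            out.append("".join(phone_to_ipa(p) for p in phones.split()))
    return " ".join(out)

print(transliterate("See the cat run"))
```

Swapping the IPA table for a table of any new symbol set would leave the rest of the sketch unchanged, which is what makes the dictionary-driven approach attractive for a batch or server-side translator.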

Mean Mr Mustard said...

"But it can be paired down further."

Of course, you actually mean 'pared' - which nicely illustrates the problem.

Dmitry Orlov said...

But I mean both pared and paired at the same time—pared by being paired into voiced/unvoiced. How do you spell both at the same time? Easy: pɛɚd.

Dmitry Orlov said...

Lacy Thompson -

I think you are thinking in the right direction. I want the new set of symbols to be able to represent all the phonemes. I also want them to represent both stress and tone without additional graphical elements. I want a streamlined entry method (for as long as there is electronics) using a small set of touches and swipes on a touchpad, and, after that, to be easily renderable using a brush, a reed pen, a stylus, a chisel or by arranging glazed ceramic tiles. I also want it to work directly as a Braille-like tactile pattern, automatically map to a semaphore system, and be simple enough to teach to special ed. students with severe physical and mental limitations.
Plus I want it to make beautiful, artistic calligraphy, like Chinese, Japanese or Korean. Oh, by the way, I already have a design candidate that does all this. It's amazing what you can achieve if you toss history out the window and design from scratch.

Lacy Thompson Jr said...

Dmitry, there is also much room for improvement in how the sentence syntax is grouped and displayed, particularly if you are looking at reading speed and comprehension. Distinctions between nouns, verbs and adjectives could be made in a variety of ways, placement above or below a horizontal reference line being just one. Enlarging the goal to improving communication and avoiding miscommunication is yet another layer to the onion.

Kraig Grady said...

anythang that makes sir-veil-lance hardur i thank is a goood idee-a

gaias daughter said...

I used to teach kindergarten and first grade, so I know first hand the frustration of students as they begin the long road to literacy. Then I came across a program written by two teachers that made that road much easier and a lot more fun -- http://www.center.edu/iPad/dekodiphukan.shtml Their program includes a storybook (suitable for camp-fire tales) that introduces the 44 sounds as picture clues, as well as flip-books that provide practice in blending the sounds to make words. Finally, there is a transition program that eases the student into traditional spelling. All that is missing is the new alphabet -- and one could be devised that would be a simplification of the pictures (the breaking stick sound of "k" could be written as an upside down "v" for example). Of course, one would need the author's permission -- but he seems to be a closet anarchist himself.

Jonlongstrider said...

A little pamphlet written in ITA taught me to read when I was four or so. A poster that explains it is here. Something about the adventures of a "skwirrel". Oh, ITA is short for Initial Teaching Alphabet.

Wolfgang Brinck said...

I arrived in America from Germany at the age of 12. I was put into 7th grade where one of the curriculum items was spelling. Every week, we would be given a list of words that we had to learn how to spell.
It never occurred to me at the time, but the notion that students should still be learning how to write their language in 7th grade is an odd one. I never gave it any thought but on looking back, I now realize that when I left Germany, there was no instruction in how to spell German words. By 7th grade that skill was already mastered.
Back to American spelling tests. I did well on them by means of a ruse. Unlike native-born American students, I did not have to memorize words explicitly. The idea of memorizing the sequence of letters that made up a word in the English language seemed absurd to me. What I did instead with the weekly spelling list was to sound out each word as if it were a German word and memorize the sound of it. Having memorized the German sound, I could then use the regular German representation of the sound to recapture the spelling. This might be a subtle point, but memorizing the sound of a word, which is what I was doing, is easier than memorizing a letter sequence, which I assume is what the other students were doing.
So I never learned how to spell in the conventional sense. If asked how to spell convention, for example, I would have to sound out the word in German, then picture the letters that represented that sound, then read them off one at a time from that mental picture.
As it was, using my own home-grown spelling technique, I was able to do better at the spelling tests than most of the English speaking natives.
I no longer use the trick of sounding out words in German. I imagine that I now acquire new words like native speakers, that is, by learning whole words by sight without any phonetic help. It seems as a consequence, my ability to spell correctly has declined.

Dmitry Orlov said...

Wolfgang -

Yes, that's a common coping mechanism for people who speak languages that can pronounce just about any sequence of letters. Streichhölzchen. Zdravstvujtje (that's Russian for "hello"). Szczebieszczyna (small town in Poland). Whatever. I once stood at the license renewal line at the DMV behind some East European, and when it was his turn to read the eye chart he didn't read it, he pronounced it! All of it! Blew away the woman behind the counter. I think good advice for somebody intending to learn to read English is, learn to read some other language first.

subgenius said...

off-topic, but you have an anarchist next door (dock?) Dmitry...

Stanislav Datskovskiy said...

Wolfgang Brinck and Kollapsnik,

This is exactly how I handled English spelling when moving from a Russian 2nd grade to a U.S. 3rd. It works.

Unknown said...

@kollapsnik

I really like the idea; I had a similar kind of thing this past summer. I was working as a business consultant and found that using OO-based diagrams could take 60 minutes of text down to 10 minutes of drawing.

Here is where I show my work,
http://americancrackpot.blogspot.com/2012/12/editing-text-for-brevity-few-exercises.html

Unknown said...

Hello,

Way back when the Shriekback 12" album Oil And Gold had the lyrics on the inner sleeve in phonetic English. I thought it was a pretty good attempt.

I no longer have the vinyl album, and on the CD the text is too small...
I tried to find an image on the web, but I couldn't.

Does anyone have it?

cmaukonen said...

Kollapsnik,

Pronouncing the letters is where I get hung up. I read/pronounce the words.

Side note. Those who are proficient at morse code can "read" both the letters and the words.

Ian said...

I agree with those who are saying that as long as you're redesigning anyway, you may want to have letters that give clues as to their pronunciation. Recall the famous Bouba/Kiki effect.

Japanese, which uses Chinese ideograms, also cheats and has a (nearly) perfectly phonetic alphabet. That's so easy to teach to children that they have two separate phonetic alphabets, and kids are expected to know both perfectly by second or third grade.

Andy Brown said...

Well, this is fun. I think your example orthography is readable and doable, and just because you haven't made it beautiful doesn't mean anarchy can't. After all, remember that the whole problem here is that language is an anarchic system (despite the best efforts of prescriptivists everywhere) while the spelling system has been kept under gerontocratic thrall by teachers and manuals and the durability of bound paper. Put it in the hands of your graffiti artists. Let's have an end to orthographic autocratism!

Unknown said...

Excellent discussion! This obviously has to be done anarchically, as numerous previous hierarchical attempts have failed. No point in trying to agree a 'standard' beforehand, writers just go for it including a simple pronunciation guide as a preamble to everything they write. Over time one system will have the most supporters and will gradually be adopted by those who want to communicate to the widest audience.

DurangoKid said...

I would favor an alphabet that only minimally introduces new characters or reassigns existing ones. For heaven’s sake, don’t start with Greek or Cyrillic! I would also reject the idea of accents. Latinized Vietnamese script resembles a macramé project. My other main objection would be the problem of changing words to the point where their origins become difficult to discern. English is a mixture of several languages, hence the problem with spelling. It might be useful to do a statistical analysis of large texts to see which spellings are most common for a given sound and start there. I’m also not opposed to diphthongs as long as their usage is consistent. This could go a long way in eliminating the dangling “e” and easing plurals and gerunds. By the way, I had to use a spell checker about six times to write this.

Joe said...

I went through the English alphabet this morning and came up with an idea similar to DurangoKid's. My resulting suggestion is to eliminate the letters q and k, replacing them both with c, pronounced k. A small change like that should go over much more smoothly and not face so much opposition.

Anonymous said...

This is my first time posting a comment...and while I disagree with your solution to the complexity of the English language, it is a problem that needs to be tackled somehow.

My main issue with it is...the correct usage of the language can also be the least efficient, in both informal and formal use. I can't count how many times a formal business dialogue has devolved to basic 5th grade English. I could go on and on...good work!

Brett Bazant said...

What a perfectly wonderful idea!
Optimality aside, how about selecting characters from the set of bare 7-bit ASCII letters, so that the scheme might have the fewest possible barriers to proliferation, and every typewriter and IBM PC could handle it?

"Close as possible to existing English, but no closer" might be a way to find a path of least resistance. So how about the strategy of taking existing spelling, sort of, but simplifying as much as possible?

I guess I'm advocating something which people can dabble with using existing computers and stuff. I know Unicode exists but... well, it is overcomplicated and high overhead. Plain text files, plain ASCII, less friction.

onething said...

Dmitry, I think you have thought well and are on the right path. I used to win the spelling bees in elementary school, but I have a perfectly smart daughter who really struggled to read and spell. I guess she got it (or didn't get it) from her father, who also doesn't spell well. Yet he's a very good engineer, and also a very fine speaker and writer!

When I have told people that I can speak a little Russian, they balk at the alphabet. Not the very difficult Russian language but the alphabet, which you can learn in a couple of days. I point out that they did learn one alphabet; the next one is a breeze.

My sister and I used to make up an alphabet so that we could write in code.

Reading both alphabets, I found that when I had spent some time reading Russian, the letters which are the same (A, E, P, C, M, etc.) are the ones which gave me trouble. Some look the same but sound different, and some are completely the same. The ones which look the same but sound different are of course the ones which caused the most trouble.

I agree therefore that what is needed is an alphabet that is completely different, so that there is no crossover confusion. It would be only 30% more time consuming to learn, and yield far smoother results ever after.

I'm not sure I like the squiggles and symbols. I'd rather have more letters. We should have letters, as Russian does, for the ch sound, the sh sound and the ng sound, for example.

Niffiwan said...

Upon some reflection, I think that it would be extremely useful to have a simple program or browser plugin that could translate any English text into dialect-specific IPA, using the rules here:
http://en.wikipedia.org/wiki/International_Phonetic_Alphabet_chart_for_English_dialects

Do you think this is something that you could write? I can really see this being useful for making the connection between the way words are written and the way they are spoken in whatever part of the English-speaking world someone lives. I say this because I think it is not true that all of us have the same conception of "phonemes" in English; I think they overlap largely but not completely.

I liked your example of phonetic English, although I did have some trouble with the "æ" sound, which I pronounce as "a" in some words. For example, "æltṛnatıv" I would instead write as "altṛnatıv" (I live in Toronto, by the way).

Although I excel at English spelling (my first language is also Russian, by the way), I to this day have difficulty with pronouncing words because the spelling gives little hint of this. And there are many words that I see on paper but rarely or never hear used.