User talk:Ooswesthoesbes/Arsjief 3



Arsjief

How do you make Oceana?
Hey Oos,

I've always wondered how you make Oceana-- is it truly just a wonderful little dump of creativity of a fake language you're making? Or is it heavily based on other languages? It's certainly realistic and is ostensibly consistent in spelling and general form. In other words, Oceana seems too good to be truly artificial.

I ask because I think it would be interesting to run analysis of Oceana to determine such things as how many words are known to exist, the average length of words, and how close Oceana is to other languages in terms of letter sequencing.

I don't know if you're programming-inclined, but one thing that could be done is to take a basic dictionary of Oceana words, give it to a program that figures out how often each character follows each other character, and finally tell the program to generate what it perceives as reasonably Oceana-like words.

A RMACHEDES 23:49, November 18, 2012 (UTC)


 * Personally, I don't think such results would be very illuminating, since e.g. the average word length doesn't tell you anything interesting about the linguistics of the language. I agree Oceana seems very naturalistic (though I'm not convinced by the history :P) and I'd like someone with a more-than-amateur interest in linguistics to comment on this. :) --Semyon 00:02, November 19, 2012 (UTC)
 * Hahah, actually, the realism of the first words I created is very low :P The more recent additions rely on grammatical endings such as -nost, -ni, and -bechen and on a nearly fixed way of changing from Slovak to Oceana.
 * In Lovia, we have about 2,500 words. On my computer, there are another 3,000. That makes 5,500 words in total :P
 * The average length of words would not be that interesting, as Oceana is an agglutinating language. To illustrate this, "length of words" will be only one word :P
 * I must say it would be interesting to run some "professional" analysis on Oceana and compare it to languages like Dutch, Slovak, and English. Probably, it will still be fairly close to English, maybe with "t" and "s" being slightly more common and "e" slightly less :P --O u WTBsjrief-mich 04:17, November 19, 2012 (UTC)
 * I can hack together a little script that can do that for you, if you'd like. All I'd need is a decent-sized corpus of text. I could compare results of looking at Oceana with results from English or Dutch, too. If you have a collection of Oceana text (a dictionary would work too, I suppose, but could give undue weight to some patterns if they commonly occur among uncommon words), could you point me towards it? A RMACHEDES  19:50, November 21, 2012 (UTC)
 * - (though not the last few lines, they are still in English) - . --O u WTBsjrief-mich 19:58, November 21, 2012 (UTC)
 * Cool, that last one ("Thie Evangelias o'that Biblia") should do the trick. I'll write that program and be right back. A RMACHEDES  20:31, November 21, 2012 (UTC)

Oceana analysis
Ok, so here's a table of which letters come after which in Oceana. Each row is a letter, each column is the character that follows it, and the probabilities are cumulative across the row (so the last value, the one for a space, is 1 for any letter that actually occurs), because I wanted to get my program to generate a few Oceana-like words and having the probabilities like that makes it easier to do so. The numbers don't mean all too much to a human, I'm afraid.

|         | a    | b    | c    | d    | e    | f    | g    | h    | i    | j    | k    | l    | m    | n    | o    | p    | q    | r    | s    | t    | u    | v    | w    | x    | y    | z    | (space) |
| a       | 0.00 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09 | 0.10 | 0.12 | 0.12 | 0.14 | 0.20 | 0.24 | 0.44 | 0.44 | 0.44 | 0.44 | 0.48 | 0.53 | 0.82 | 0.82 | 0.84 | 0.86 | 0.86 | 0.87 | 0.89 | 1.00 |
| b       | 0.07 | 0.07 | 0.07 | 0.07 | 0.31 | 0.31 | 0.31 | 0.31 | 0.51 | 0.51 | 0.51 | 0.55 | 0.55 | 0.55 | 0.58 | 0.58 | 0.58 | 0.66 | 0.66 | 0.66 | 0.88 | 0.88 | 0.88 | 0.88 | 0.98 | 0.98 | 1.00 |
| c       | 0.27 | 0.27 | 0.27 | 0.27 | 0.31 | 0.31 | 0.31 | 0.70 | 0.70 | 0.70 | 0.79 | 0.80 | 0.80 | 0.80 | 0.98 | 0.98 | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 1.00 |
| d       | 0.14 | 0.14 | 0.14 | 0.18 | 0.40 | 0.40 | 0.40 | 0.40 | 0.48 | 0.48 | 0.48 | 0.50 | 0.50 | 0.52 | 0.68 | 0.68 | 0.68 | 0.69 | 0.73 | 0.73 | 0.75 | 0.75 | 0.76 | 0.76 | 0.76 | 0.77 | 1.00 |
| e       | 0.03 | 0.05 | 0.06 | 0.10 | 0.12 | 0.13 | 0.13 | 0.14 | 0.14 | 0.14 | 0.15 | 0.19 | 0.25 | 0.36 | 0.36 | 0.37 | 0.37 | 0.44 | 0.50 | 0.59 | 0.59 | 0.59 | 0.61 | 0.61 | 0.62 | 0.62 | 1.00 |
| f       | 0.03 | 0.03 | 0.03 | 0.03 | 0.16 | 0.25 | 0.25 | 0.25 | 0.30 | 0.30 | 0.30 | 0.34 | 0.34 | 0.34 | 0.69 | 0.69 | 0.69 | 0.73 | 0.74 | 0.84 | 0.88 | 0.88 | 0.88 | 0.88 | 0.88 | 0.88 | 1.00 |
| g       | 0.10 | 0.10 | 0.10 | 0.10 | 0.17 | 0.17 | 0.17 | 0.22 | 0.29 | 0.29 | 0.29 | 0.41 | 0.42 | 0.43 | 0.78 | 0.78 | 0.78 | 0.83 | 0.83 | 0.84 | 0.88 | 0.88 | 0.88 | 0.88 | 0.88 | 0.88 | 1.00 |
| h       | 0.26 | 0.26 | 0.26 | 0.26 | 0.44 | 0.44 | 0.44 | 0.44 | 0.64 | 0.64 | 0.65 | 0.66 | 0.66 | 0.66 | 0.77 | 0.77 | 0.77 | 0.78 | 0.78 | 0.79 | 0.89 | 0.89 | 0.89 | 0.89 | 0.89 | 0.89 | 1.00 |
| i       | 0.04 | 0.05 | 0.06 | 0.08 | 0.24 | 0.25 | 0.30 | 0.30 | 0.30 | 0.31 | 0.32 | 0.35 | 0.39 | 0.65 | 0.65 | 0.66 | 0.66 | 0.67 | 0.77 | 0.87 | 0.87 | 0.90 | 0.90 | 0.90 | 0.90 | 0.90 | 1.00 |
| j       | 0.10 | 0.10 | 0.10 | 0.10 | 0.33 | 0.33 | 0.33 | 0.33 | 0.43 | 0.43 | 0.43 | 0.43 | 0.43 | 0.43 | 0.70 | 0.70 | 0.70 | 0.70 | 0.74 | 0.74 | 0.87 | 0.87 | 0.87 | 0.87 | 0.87 | 0.87 | 1.00 |
| k       | 0.11 | 0.11 | 0.11 | 0.11 | 0.24 | 0.24 | 0.24 | 0.24 | 0.32 | 0.32 | 0.32 | 0.37 | 0.37 | 0.39 | 0.48 | 0.48 | 0.48 | 0.57 | 0.61 | 0.74 | 0.83 | 0.83 | 0.83 | 0.83 | 0.88 | 0.88 | 1.00 |
| l       | 0.20 | 0.20 | 0.20 | 0.22 | 0.42 | 0.42 | 0.42 | 0.42 | 0.50 | 0.50 | 0.50 | 0.60 | 0.62 | 0.62 | 0.80 | 0.81 | 0.81 | 0.81 | 0.83 | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 | 1.00 |
| m       | 0.11 | 0.11 | 0.11 | 0.11 | 0.30 | 0.30 | 0.30 | 0.30 | 0.37 | 0.37 | 0.37 | 0.38 | 0.41 | 0.42 | 0.50 | 0.51 | 0.51 | 0.51 | 0.52 | 0.52 | 0.56 | 0.56 | 0.56 | 0.56 | 0.58 | 0.58 | 1.00 |
| n       | 0.10 | 0.10 | 0.10 | 0.14 | 0.33 | 0.33 | 0.38 | 0.38 | 0.43 | 0.45 | 0.46 | 0.46 | 0.46 | 0.47 | 0.48 | 0.48 | 0.48 | 0.48 | 0.51 | 0.54 | 0.55 | 0.55 | 0.55 | 0.55 | 0.55 | 0.55 | 1.00 |
| o       | 0.01 | 0.03 | 0.04 | 0.11 | 0.11 | 0.13 | 0.14 | 0.14 | 0.15 | 0.15 | 0.17 | 0.20 | 0.25 | 0.30 | 0.31 | 0.32 | 0.32 | 0.37 | 0.48 | 0.58 | 0.61 | 0.69 | 0.80 | 0.80 | 0.80 | 0.81 | 1.00 |
| p       | 0.06 | 0.06 | 0.06 | 0.06 | 0.20 | 0.20 | 0.20 | 0.24 | 0.27 | 0.27 | 0.27 | 0.29 | 0.29 | 0.30 | 0.63 | 0.65 | 0.65 | 0.88 | 0.89 | 0.92 | 0.96 | 0.96 | 0.96 | 0.96 | 0.96 | 0.96 | 1.00 |
| q       | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| r       | 0.16 | 0.16 | 0.16 | 0.21 | 0.34 | 0.35 | 0.36 | 0.37 | 0.49 | 0.49 | 0.49 | 0.49 | 0.49 | 0.50 | 0.64 | 0.65 | 0.65 | 0.66 | 0.71 | 0.75 | 0.78 | 0.78 | 0.78 | 0.78 | 0.80 | 0.80 | 1.00 |
| s       | 0.04 | 0.04 | 0.04 | 0.04 | 0.18 | 0.18 | 0.18 | 0.37 | 0.42 | 0.42 | 0.44 | 0.45 | 0.46 | 0.46 | 0.51 | 0.52 | 0.52 | 0.53 | 0.55 | 0.66 | 0.66 | 0.66 | 0.68 | 0.68 | 0.70 | 0.70 | 1.00 |
| t       | 0.03 | 0.03 | 0.05 | 0.05 | 0.15 | 0.15 | 0.15 | 0.52 | 0.54 | 0.54 | 0.54 | 0.55 | 0.55 | 0.56 | 0.59 | 0.59 | 0.59 | 0.60 | 0.67 | 0.72 | 0.72 | 0.72 | 0.73 | 0.73 | 0.73 | 0.73 | 1.00 |
| u       | 0.00 | 0.02 | 0.02 | 0.13 | 0.35 | 0.35 | 0.35 | 0.35 | 0.35 | 0.35 | 0.36 | 0.38 | 0.41 | 0.50 | 0.50 | 0.51 | 0.51 | 0.57 | 0.66 | 0.72 | 0.72 | 0.72 | 0.72 | 0.72 | 0.72 | 0.73 | 1.00 |
| v       | 0.34 | 0.34 | 0.34 | 0.34 | 0.67 | 0.67 | 0.67 | 0.67 | 0.85 | 0.85 | 0.85 | 0.86 | 0.86 | 0.86 | 0.93 | 0.93 | 0.93 | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| w       | 0.13 | 0.13 | 0.13 | 0.14 | 0.26 | 0.26 | 0.29 | 0.29 | 0.30 | 0.30 | 0.35 | 0.35 | 0.35 | 0.39 | 0.40 | 0.40 | 0.40 | 0.44 | 0.57 | 0.60 | 0.61 | 0.61 | 0.61 | 0.61 | 0.61 | 0.61 | 1.00 |
| x       | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| y       | 0.16 | 0.16 | 0.16 | 0.20 | 0.34 | 0.34 | 0.34 | 0.34 | 0.39 | 0.39 | 0.39 | 0.44 | 0.45 | 0.54 | 0.54 | 0.54 | 0.54 | 0.60 | 0.60 | 0.63 | 0.63 | 0.67 | 0.67 | 0.67 | 0.67 | 0.71 | 1.00 |
| z       | 0.23 | 0.23 | 0.23 | 0.34 | 0.47 | 0.47 | 0.47 | 0.48 | 0.52 | 0.52 | 0.52 | 0.57 | 0.58 | 0.58 | 0.69 | 0.69 | 0.69 | 0.71 | 0.71 | 0.71 | 0.89 | 0.89 | 0.91 | 0.91 | 0.91 | 0.93 | 1.00 |
| (space) | 0.08 | 0.15 | 0.17 | 0.22 | 0.25 | 0.27 | 0.29 | 0.35 | 0.39 | 0.40 | 0.45 | 0.48 | 0.51 | 0.54 | 0.61 | 0.63 | 0.63 | 0.65 | 0.70 | 0.90 | 0.91 | 0.92 | 0.93 | 0.93 | 0.94 | 0.95 | 1.00 |

Yeah, it's a little long. :P Anyhow, I got the program to print out some fake Oceana for me. Tell me if it's reasonable (note that it's normal that 80% of these are obviously crap: random numbers cannot reproduce actual language too well). Btw, the technique I'm using is called a Markov chain, and it's a fairly common way to generate natural-language-like text. Most of the time, though, people chain consecutive words rather than letters, and it's to generate spam.

>> Generating 100 words, then filtering out words where length < 3:

ellos ado puelseate ldal thuero tevepon eds hothuers igrathnahane biet ana athu oves tha yat kil han thi dothi the otabie brear tlatl tovigushabinenene hin ath broviebudenine modedalanen thu henarin prse aratide ssh thin kun aish den opho yie chlahetem jede ckud theve foshiniat heren owraradanen ithelo hat inere aneerow rca tshinabyemie anenshe syedd jis

What do you think? A RMACHEDES 20:57, November 21, 2012 (UTC)
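
The script behind the table and the word list isn't shown in this thread, so the following is only a minimal sketch of the same idea, not the actual program. It counts which character follows which in a corpus, stores each row as cumulative probabilities (as in the table), and then walks the chain from a space until it hits another space. The names `build_cumulative_table`, `generate_word` and `generate_words`, the 27-character alphabet, and the `max_len` cap are all assumptions made for illustration.

```python
import bisect
import random
from collections import Counter, defaultdict
from itertools import accumulate

# Assumed alphabet: 26 letters plus a space that marks word boundaries.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def build_cumulative_table(text):
    """Map each character to cumulative probabilities of the next character."""
    # Anything outside the alphabet is treated as a word boundary.
    cleaned = "".join(ch if ch in ALPHABET else " " for ch in text.lower())
    # Collapse runs of spaces and pad so every word starts and ends on a space.
    cleaned = " " + " ".join(cleaned.split()) + " "
    counts = defaultdict(Counter)
    for prev, nxt in zip(cleaned, cleaned[1:]):
        counts[prev][nxt] += 1
    table = {}
    for prev, following in counts.items():
        total = sum(following.values())
        running = accumulate(following[ch] for ch in ALPHABET)
        table[prev] = [r / total for r in running]
    return table


def generate_word(table, rng, max_len=20):
    """Walk the chain from a space until a space is drawn or max_len is hit."""
    word = ""
    current = " "
    while len(word) < max_len:
        row = table[current]
        # First column whose cumulative value exceeds the random draw.
        idx = min(bisect.bisect_right(row, rng.random()), len(ALPHABET) - 1)
        nxt = ALPHABET[idx]
        if nxt == " ":
            break
        word += nxt
        current = nxt
    return word


def generate_words(table, n=100, min_len=3, seed=None):
    """Generate n words, then drop the ones shorter than min_len."""
    rng = random.Random(seed)
    words = [generate_word(table, rng) for _ in range(n)]
    return [w for w in words if len(w) >= min_len]
```

With a non-empty corpus loaded into a string `corpus`, `generate_words(build_cumulative_table(corpus))` mirrors the "generate 100 words, then filter out words where length < 3" step above. Storing the rows cumulatively is what makes the lookup a single binary search per letter.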

I'll leave it to Oos to comment on whether the words are realistic or not. :P I will say, though, that the program in some cases has made some (understandable) mistakes; in particular, there are certain consonant combinations that are limited to certain places in a word - hence 'ckud' and 'rca' are illegal because 'ck' and 'rc' cannot appear at the start or end of a word respectively. But yeah, some of the words you generated are real Oceana words, so nice work I guess. :P --Semyon 21:11, November 21, 2012 (UTC)
 * Thanks. This technique works really well in some languages, not so much in others. In Spanish, with the occasional error due to three "l"s in a row, the results are truly impeccable. A RMACHEDES  21:22, November 21, 2012 (UTC)

Oh, and the letter distribution: A RMACHEDES  21:22, November 21, 2012 (UTC)
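
For anyone wanting to reproduce a letter distribution like this, here is a rough sketch, not the code actually used here. It computes relative letter frequencies for a text and lists the letters whose shares differ most between two texts (say, an Oceana corpus and an English one). The function names `letter_frequencies` and `biggest_differences` are hypothetical, and only the 26 unaccented letters are counted, which is an assumption.

```python
import string
from collections import Counter


def letter_frequencies(text):
    """Return the relative frequency of each letter a-z in the text."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {letter: counts[letter] / total for letter in string.ascii_lowercase}


def biggest_differences(freq_a, freq_b, top=5):
    """List the letters whose share differs most between two distributions.

    A positive value means the letter is more common in freq_a.
    """
    diffs = {
        letter: freq_a.get(letter, 0.0) - freq_b.get(letter, 0.0)
        for letter in string.ascii_lowercase
    }
    return sorted(diffs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top]
```

Calling `biggest_differences(letter_frequencies(oceana_text), letter_frequencies(english_text))`, with both texts supplied by the reader, would show which letters are over- or under-represented relative to English.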



I'm in semi-progress of making a conlang - grammar is mostly Spanish or English, but the words/character make it stand out. 77topaz (talk) 22:40, November 21, 2012 (UTC)

Very interesting! Oceana is fairly close to English, with "a", "h", "k", and "t" being slightly more common. The words indeed are mostly crappy, but apart from some obvious nonsense such as "tlatl", "rca", and "syedd", it's quite reasonable :) --O u WTBsjrief-mich 05:21, November 22, 2012 (UTC)

"Tlatl" reminds me of the Zhodani language from "Traveller". 77topaz (talk) 07:05, November 22, 2012 (UTC)
 * "tlatl" is the Nahuatl word for "sling" :) --O u WTBsjrief-mich 10:56, November 23, 2012 (UTC)
 * Do you speak Nahuatl? :O --Semyon 10:59, November 23, 2012 (UTC)
 * Not fluently :P --O u WTBsjrief-mich 11:00, November 23, 2012 (UTC)

Photo
Can you find a photo of a young actor so I can use it for one of my characters? I cannot find a good photo.  Happy65   Talk CNP   12:25, November 24, 2012 (UTC)
 * Have you tried Wikimedia Commons? --O u WTBsjrief-mich 12:27, November 24, 2012 (UTC)


 * Still cannot find a good photo.  Happy65   Talk CNP  12:28, November 24, 2012 (UTC)
 * Perhaps you shouldn't look for "actor", but for something else like "young man" etc. If the image is fine, nobody'll have to know it ain't an actor in real life :) --O u WTBsjrief-mich 12:30, November 24, 2012 (UTC)


 * I cannot; the good ones I find are all under a "no copying" copyright thingy.  Happy65   Talk CNP  12:32, November 24, 2012 (UTC)
 * Nobody cares. --O u WTBsjrief-mich 12:33, November 24, 2012 (UTC)


 * I care, I don't want to end up in court in The Hague.  Happy65   Talk CNP  12:35, November 24, 2012 (UTC)
 * Okay, then you'd have to search further or not include a picture at all :P --O u WTBsjrief-mich 12:36, November 24, 2012 (UTC)

Hello
Hello Oos Wes Ilava! I am Dave Leskromento of the Peace Island Times and would like to suggest to you an important article on the Peace Island Times. It is the newest article. Thanks.  Happy65   Talk CNP  16:55, December 3, 2012 (UTC)
 * You made me curious, I'll read it right away :P --O u WTBsjrief-mich 17:23, December 3, 2012 (UTC)

Hey Oos, I'm going to redo the economic statistics; I forgot to take into account the relative strength/weakness of the different state economies in different sectors. Hoffmann Kunarian TALK 16:06, December 5, 2012 (UTC)
 * Okay. If you need help, just ask. --O u WTBsjrief-mich 16:08, December 5, 2012 (UTC)

IP blocked
For some reason ("spam") they cross-wiki blocked my IP or account... I hope this'll get fixed soon, until then, I'll just have to wait :P --O u WTBsjrief-mich 05:20, December 6, 2012 (UTC)
 * Outrageous! --Semyon 13:32, December 6, 2012 (UTC)
 * It should be fixed now. Wikia accidentally blocked all users of one of their servers :P --O u WTBsjrief-mich 16:56, December 6, 2012 (UTC)
 * Well, they have to mess up once in a while. :P Marcus/Michael Villanova 22:34, December 6, 2012 (UTC)
 * And by 'once in a while' we mean 'all the time.' *waits for block* --Semyon 22:46, December 6, 2012 (UTC)