Lexical Distance Among the Languages of Europe (elms.wordpress.com)
137 points by fforflo on Feb 1, 2016 | 48 comments


It's a pity that the Basque language is not in this graph.

From wikipedia [1]:

> The impossibility of linking Basque with its Indo-European neighbors in Europe has inspired many scholars to search for its possible relatives elsewhere. Besides many pseudoscientific comparisons, the appearance of long-range linguistics gave rise to several attempts to connect Basque with geographically very distant language families. All hypotheses on the origin of Basque are controversial, and the suggested evidence is not generally accepted by most linguists.

[1] https://en.wikipedia.org/wiki/Basque_language


Or Yiddish and its variants.


This is for entertainment value only. There are many ways to get ratings of language similarity (see https://en.wikipedia.org/wiki/Quantitative_comparative_lingu... for an overview) and this graphic is from a paper published before Bayesian phylogenetic methods took over the field (I can't read the original Russian though).

For more recent research, see the second two figures at http://language.cs.auckland.ac.nz/media-material/
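
To make "ratings of language similarity" concrete, here is a minimal sketch of one very simple measure: mean normalized Levenshtein (edit) distance over a list of paired words. This is an illustrative assumption, not the method behind the graphic and not one of the Bayesian phylogenetic methods. The function names are made up for the example, and the English/West Frisian word pairs are taken from the proverb quoted elsewhere in this thread.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the length of the longer word."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def mean_distance(pairs) -> float:
    """Average normalized distance over (word, word) pairs."""
    return sum(normalized_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy word list: English vs. West Frisian spellings.
PAIRS = [
    ("bread", "brea"),
    ("butter", "bûter"),
    ("green", "griene"),
    ("cheese", "tsiis"),
    ("good", "goed"),
]

if __name__ == "__main__":
    print(f"mean normalized distance: {mean_distance(PAIRS):.2f}")
```

A lower score means the spellings are closer. Spelling is only a rough proxy for sound, so a toy score like this says little by itself about how related two languages are.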


Would you say that more recent methods allow us to talk about degree of language similarity in a way that is objective enough and meaningful/useful enough that the results are not "for entertainment value only"?


"Objective enough" isn't the issue, rather it's how confident we are in the inferences. And yes, I would say that the new methods are preferable because 1) they do a better job of explicitly quantifying and representing uncertainty 2) they use more sophisticated source data/features.


> This is for entertainment value only.

I disagree. The charts you linked to are quite interesting and go much deeper. They're also harder to parse, both visually and conceptually.

But (for the unacquainted) the original chart tells you right away -- bang! -- how the main linguistic groups in Europe relate to each other, to a first-order approximation (give or take a few regrettable omissions).

Keep in mind that many people are still very ignorant of even the basic overlaps between these families (for example, that English and German are actually quite close). And this has a lot to do with why they think that learning another language as an adult is much more difficult than it really is.


And then there's Hungarian, the language of my ancestors, way off on the side, where no one really knows where it came from.


There seems to be some agreement that Hungarian is one of the many languages of the Finno-Ugric family (https://en.wikipedia.org/wiki/Finno-Ugric_languages), but you're right about the unclear geographical origins.


Current theories say it originated from a language spoken between the Black Sea and the Ural Mountains [1] and was probably brought to present-day Eastern Europe during a migration.

[1] https://en.wikipedia.org/wiki/Finno-Ugric_languages#Origins


See also "the directed graph of stereotypical incomprehensibility" http://languagelog.ldc.upenn.edu/nll/?p=1024


No connection between Greek and Italian doesn't seem right. Many Italian words are influenced by Greek, to the point that ancient Greek is studied in some schools.


And vice-versa: modern Greek has borrowed quite a few words from Italian, e.g. Porta (Door).

The Greek-Dutch connection is a surprising one as well, I'm not sure what they're basing it on.


Yeah, a full graph would have been nice. Maybe one where you can click a language and it rearranges centred on that.


The graphic is useful, but the underlying data is imprecise: it shows Serbian as closer to Russian than Belarusian is, which is just wrong.


Is the reason English is closer to Danish because the Anglo-Saxons spoke something like that or is it because of Vikings conquering parts of England for a while?

https://en.wikipedia.org/wiki/Danelaw


The Saxons, Angles and Jutes that invaded after the fall of Rome all generally came from Denmark[1].

Most of the Norse who subsequently invaded between 700 and 1000 were either Danes or Norwegians[2], and they settled in pretty large numbers in East Anglia, Northumbria and Mercia.

Then the Normans, who were in large part Danes and Norwegians that had been bought off with a piece of France, came over in 1066.

I don't know how accurate this is, but in Bernard Cornwell's Uhtred series, English and Danish are different enough that they aren't really mutually intelligible.

[1] https://en.wikipedia.org/wiki/Angles#/media/File:Angles_saxo...

[2] https://en.wikipedia.org/wiki/Great_Heathen_Army


Probably a bit of both. 'Old English' is probably closer to Viking-era Danish than, say, modern German and of course the Vikings also did conquer parts of England.


The Vikings, and also the Norse who had settled in Normandy. The latter had adopted lots of Old French vocab but retained some ancestral vocabulary when they invaded and subjugated the by-then-indigenous Anglo-Saxons.

Also, north-east England was under Danish influence for quite a while.


I can't remember where, but I read there is a theory that English originated as a creole of Anglo-Saxon and Old Norse. The two languages had similar vocabulary (e.g. AS scirt -- modern English shirt -- vs ON skirt, both originally referring to a unisex knee-length tunic), but different inflections: the vocabulary was kept but much of the inflection was dropped, which is why English is morphologically so much simpler than, say, German.


Looks like a good opportunity to plug one of my favorite podcasts, The History of English Podcast (http://historyofenglishpodcast.com/). If you're curious about the history of the language and its Indo-European cousins, it's fascinating.


I think it is because of the latter. The Anglo-Saxons, while coming from an area that coincides with modern day Denmark, did not speak (an ancient form of) Danish.


It's so funny that the Spanish-Catalan and Catalan-Italian links are strong but the Spanish-Italian link is weaker. As a Valencian (we also speak Catalan) I understand most of the Italian I hear, but for most people in Spain it's not so easy.


I read somewhere that English is quite rare in that it doesn't have at least one mutually intelligible near relative. If true, the diagram doesn't reflect that.


The Scots language is a good counterpoint - they aren't just speaking "English" with an accent and the odd bits of different vocabulary such as the word "outwith".

And Frisian comes close, e.g. "Bread, butter and green cheese is good English and good Fries", which sounds not very different from "Brea, bûter en griene tsiis is goed Ingelsk en goed Frysk".

https://en.wikipedia.org/wiki/West_Frisian_language


That seems like a weird assertion on the part of whatever you read, because 1) it isn't clear what constitutes a separate, well-defined language (see https://en.wikipedia.org/wiki/Dialect_continuum) and 2) mutual intelligibility is a gradient, not a binary.

Many speakers of Standard American English have to turn on subtitles to understand what's going on in movies from the UK (like Trainspotting). So maybe we call a great many things English?


For what it's worth, I think I recall where I read it: "The Power of Babel", a popular science linguistics book by John McWhorter. I'll see if I can dig up the quote later and report back.


Scots, Jamaican Patois, Hawaiian Pidgin, Tok Pisin


I don't think it's true. Depends on what you mean by rare, I guess.


I can't think of a fourth big Scandinavian language that abbreviates to BOK. Swedish, Norwegian, Danish...?


Norwegian Bokmål, a written form of Norwegian.

https://en.m.wikipedia.org/wiki/Bokm%C3%A5l


So "NN" is presumably Nynorsk, then.

https://en.wikipedia.org/wiki/Nynorsk


Many thanks


Posted by Teresa Elms on 4 March 2008.


yes.


I don't think there should be a link between Slovene and Albanian.


I don't know either language (they're distantly related), but I'm guessing they might have borrowed words from each other. Swedish and Finnish are completely unrelated, and are also linked, presumably for the same reason.


If Albanian should be connected to any Slavic languages, though, it should be connected with Serbian, Macedonian, and Bulgarian, the languages that it neighbors, and which have exerted mutual cultural influence on each other. The point the commenter above was making is that choosing Slovenian specifically to be the linking language to Albanian in the sprachbund is very arbitrary.


https://en.wikipedia.org/wiki/Sprachbund

> A sprachbund (/ˈsprɑːkbʊnd/; German: [ˈʃpʁaːxbʊnt], "federation of languages") – also known as a linguistic area, area of linguistic convergence, diffusion area or language crossroads – is a group of languages that have common features resulting from geographical proximity and language contact. They may be genetically unrelated, or only distantly related. Where genetic affiliations are unclear, the sprachbund characteristics might give a false appearance of relatedness. Areal features are common features of a group of languages in a sprachbund.


Interesting that Greek is so isolated, even though so many European languages are derived from it. Maybe it's because Greek was essentially a dead language that was reintroduced, so other languages had drifted far from it?


> Maybe it's because Greek was essentially a dead language that was reintroduced

How is that? Haven't they continued to speak it in Greece all these years? Granted, modern Greek is different than the classic version of the language, but that's to be expected over thousands of years...


But what languages are derived from Greek, as opposed to having a lot of vocabulary derived from it?


Maltese was excluded too, and falls into a similar position of being unrelated genetically but having imported a lot of vocabulary through Italian and English. It started as an Arabic dialect, but has a very high loanword rate, so a simple sentence is indistinguishable from Arabic, but a complex sentence is easy to understand for a speaker of a Romance language.

It's stuff like that which can't be drawn in a chart like this.


No languages are closely related to Greek except for other varieties of Greek. It's on its own branch of the Indo-European family. English, like all other Indo-European languages from Irish to Sinhalese, is a distant relative. Also, Greek has been continuously spoken since classical times.


Not sure that very many European languages are derived from it in any form other than its alphabet (and even then it's fairly removed).

I could be wrong (this is well outside my area of expertise), but I think maybe you're thinking of Latin?


He's probably getting confused by the fact that a lot of technical vocabulary is borrowed from (ancient) Greek.


I guess I did assume that since there are quite a few Greek roots in English, that there was more direct influence.


No, you have to look at the core vocabulary. For example, most English words are derived from Latin (often through French), but all its core vocabulary (except for one or two words such as "very") is Germanic. So English is a Germanic language. Its earliest written form is Anglo-Saxon, the language of King Alfred.

Greek words in English are rarer, and are generally not core vocabulary, e.g. "telescope", "bacteria", "paragraph", "angel".

There are Greek cognates, which found their way into both English and Greek from their common ancestor Indo-European. These are harder to recognize. Examples include ἔργον which is cognate with "work", and πέτομαι/πτερόν (I fly/wing) which are cognate with "feather".


It's fun to read everyday signs in Greece without understanding the language. The words are familiar yet the meaning is lost in translation.

For example, I saw a truck labelled "ethniki metafora". Pretty sure it doesn't literally mean "ethnic metaphor" though! (I think the proper translation would be "national transport".)



