It's only considering Altaic and Indo-European languages.
Data is Beautiful
So Thai is the current meta
Syllables can vary in length. Japanese has very short syllables while English has rather long ones. Counting phonemes would make more sense.
Wonder how Thai is the zipfile of languages.
It is multiplexed with five tones and a variety of different registers to signify relationship, status, and variable interplay between the two based on situation.
- University Thai language learner, linguist, and professional Thai reading, writing, speaking in Thailand for several years
My very casual understanding is that grammatical structure or gender isn't really a thing, or articles for that matter, making it a very contextual and tonal language, so a zipfile isn't even a bad metaphor.
However, in this case it seems like the human brain is the default Windows zip program.
I am curious about Arabic. I feel like it should have the highest information rate.
What makes you think that? I'm curious. I would've assumed something like Inuktitut (1 word conveys subject verb object tense ...) or something like toki pona (removes unused information) or maybe a highly analytical language like one of the Chinese languages.
I was comparing Arabic to other languages with the most speakers in the world. I have no idea what those languages you mentioned sound like. And I bet conlangs could be designed to fulfill such requirements as well.
Cowards left out Navajo.
It's long been suspected that Koreans are really fast with rhythm games and have high APM because of their language getting to the point faster.
As someone who speaks both French and English, I'm surprised to see French as the leading "information density" language. Most French terms have been incorporated into English. The language tends to lag behind on technology terms, and it has no noticeable difference from English in short-syllable common words. It also seems to me that French speakers have an easier time being vague. I have the impression that English is more precise.
Both were massive empires. Makes sense that imperialism would put selective pressure on language. Historically you're either limited in words by space on a paper or what can be easily repeated by messengers.
I feel like the multitude of tenses in French helps with being more precise.
The tenses don't add precision, IMO. There is a plural them instead of him/her but it sounds the same as the singular him/her. There is a plural you that sounds different, but there is also a polite singular you that is the plural you.
I had the same feeling. I honestly just feel like English is a junk drawer of depth, borrowing from various languages, but maybe average speakers don't try to dig deep into it?
In most cases, being vague requires more informational transfer. To be vague but still connected to whatever is the signified, you need to give more information around the idea rather than simply stating the idea. Think about being vague about how you feel versus being blunt about it.
Looking at the two curves, it looks like they are pretty close but French edges out English because of the speed it's spoken at.
Even when it was fresh in my mind, I was never able to follow French tv because they just go so fast.
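To put the speed point above in numbers, here's a rough sketch. It assumes the chart's "information rate" is just information density multiplied by speech rate, and the units (bits per syllable, syllables per second) are my assumption too. The values are made-up placeholders, not the chart's data, and `information_rate` is a name I picked for illustration.

```python
def information_rate(bits_per_syllable: float, syllables_per_second: float) -> float:
    """Information rate in bits per second: density times speech rate."""
    return bits_per_syllable * syllables_per_second


# Placeholder values, NOT measurements. They only show how a slightly less
# dense language can still come out ahead if it is spoken faster.
lang_a = information_rate(bits_per_syllable=7.0, syllables_per_second=6.0)  # denser, slower
lang_b = information_rate(bits_per_syllable=6.5, syllables_per_second=7.0)  # less dense, faster

print(f"A: {lang_a} bits/s, B: {lang_b} bits/s")
```

So two curves can sit close together on density and the faster-spoken language still edges ahead on overall rate.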
Yeah, like "qu'est-ce que c'est ?", which is just "what's that?" (I speak both too.) I would never have guessed French had more information encoded. French translations are always longer too (but you don't always pronounce everything, of course).
I think this more so demonstrates how tedious written French is. “Qu’est-ce que c’est?” is significantly faster to say than “what’s that?”
I’d wager that if the chart were on information density per written letter or word, French would be way further behind.
Right, spoken French could be written more or less as Kès-ke-cè.
I always thought that English was an efficient language.
Switch to Rust. I speak Rust btw.
On arch
Nah NixOS