You really think that in decades of linguists studying Linear A, no one has thought of trying Zipf's law?
If scientists have studied something for this long, and you come up with an idea that fits in a single paragraph, it's probably been tried and didn't work. Unless you're the field's leading expert in which case you would be off doing it, not posting it on HN :)
Neither of these areas is my field, so I could be entirely misunderstanding this preprint [1] from 2021. The preprint mentions using Zipf's law in the objectives section on attempting to decipher Linear A.
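For anyone wondering what "trying Zipf's law" looks like in practice, here is a minimal sketch (not the preprint's method, as far as I can tell). It counts how often each sign occurs, ranks the signs by frequency, and fits a straight line to log(frequency) against log(rank); a slope near -1 is the classic Zipfian pattern. The name `zipf_slope` and the toy corpus are made up for illustration.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank)."""
    # Frequencies sorted from most to least common; rank 1 is the most common.
    counts = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Toy corpus (an assumption, not real data): the k-th sign occurs 60 // k times.
toy = []
for k, sign in enumerate("ABCDEF", start=1):
    toy.extend([sign] * (60 // k))

slope = zipf_slope(toy)
print(round(slope, 3))
```

The toy counts are exactly 60/k, so the fitted slope comes out at about -1. A real corpus would need at least two distinct signs and far more data than this.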
The literature survey section mentions there have been good results using computational methods in 2020 to automatically decipher Linear B. The discussion section mentions "To the best of our knowledge, this is the first study to discuss and show computational analysis of Linear A."
Again, neither of these is my field, but it looks like if these linguists have tried to use Zipf's law or other computational methods unsuccessfully in deciphering Linear A, the results weren't published. (Or the literature survey is poor, or there are other explanations...) I'm not an academic either, so I don't know what the practices are for publishing unsuccessful results.
That looks like an extremely underwhelming bit of research.
Some thoughts:
1. Figure 2 is titled "Evolutionary Tree of ancient scripts reconstructed using Neighbor Joining algorithm in ClustalW2 (Daggumati et al., 2019)". However, checking the references: (a) that figure does not appear in the cited paper, (b) the cited paper does not use ClustalW2, and (c) the scripts in the cited paper do not match those in the figure.
2. Speaking of references, they include Classen, M., & Safrany, L. (1975). Endoscopic papillotomy and removal of gall stones. This seems like an odd choice. Why did they include this? I have no idea; I can't find any citation of it in the text.
3. Immediately after "When text is rendered by a computer, the characters in the text can not all be displayed, because no font that supports them is available to the computer" they write: "The Python package Matplotlib also does not render the font as of the current version 3.3.4." I would have thought that rendering with matplotlib would have counted as "render[ing] by a computer". Please also note Figure 5, which has a screenshot of a Python script and a matplotlib plot; the former displays 5 (presumed) Linear A characters correctly while the latter has only boxes. Perhaps they meant 'there is no font in which all Linear A characters are present'. (And I should point out that in less than sixty seconds of searching I discovered that--as you might expect--matplotlib can use user-specified .ttf fonts.)
(Sorry, I know this point is long, but another thing is bothering me: why bother sticking with Unicode code points when they could have instead just assigned integers to each character?)
4. For reasons that escape me, they chose to use a word cloud in Figure 15 ("Word cloud of locations the Linear A tablets were found by the number of symbols gathered from each") instead of, for example, a table. Why is 'HT' in there at least twice? Why is 'KH' in there at least six times?
I could go on but I'll stop here and say that if I were reviewing this for a journal it would be an immediate rejection.
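On the aside in point 3 about using plain integers instead of Unicode code points, here is a rough sketch of what that could look like. It gives each distinct character a small integer in order of first appearance, so nothing downstream depends on a font being installed. The function name `index_signs` is hypothetical, and the sample string is just a few code points from the Linear A Unicode block (U+10600 to U+1077F), not a real inscription.

```python
def index_signs(text):
    """Assign each distinct character a small integer, in order of first appearance."""
    sign_to_id = {}
    ids = []
    for ch in text:
        if ch not in sign_to_id:
            sign_to_id[ch] = len(sign_to_id)
        ids.append(sign_to_id[ch])
    return sign_to_id, ids


# Placeholder sequence (an assumption): arbitrary code points from the
# Linear A block, chosen only to show repeated signs mapping to the same id.
sample = "".join(chr(cp) for cp in (0x10600, 0x10601, 0x10600, 0x10602, 0x10601))

mapping, ids = index_signs(sample)
print(ids)  # [0, 1, 0, 2, 1]
```

The integer sequence can then be counted, plotted, or clustered like any other list of numbers, and the mapping can be inverted when the original characters are needed.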
Interesting. If they have only very recently tried Zipf's law then there may be some other more advanced stuff they haven't tried.
I'm thinking word embeddings. Like maybe you could do a word embedding based on co-occurrence and look for similarly shaped clusters in Linear A and Early Greek.
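To make that a bit more concrete, here is a very rough sketch of the co-occurrence half of the idea. It uses raw co-occurrence count vectors and cosine similarity, which is the simplest count-based stand-in for an embedding, not a learned model like word2vec. The corpus is a made-up English toy, the function names are hypothetical, and comparing cluster shapes across two scripts is not attempted here.

```python
import math


def cooccurrence_vectors(sentences, window=1):
    """Map each word to a vector counting the words seen within `window` positions."""
    vocab = sorted({word for sentence in sentences for word in sentence})
    index = {word: i for i, word in enumerate(vocab)}
    vectors = {word: [0] * len(vocab) for word in vocab}
    for sentence in sentences:
        for i, word in enumerate(sentence):
            lo = max(0, i - window)
            hi = min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][index[sentence[j]]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus (an assumption): words used in the same contexts should end up close.
corpus = [
    ["king", "rules", "land"],
    ["queen", "rules", "land"],
    ["fish", "swims", "sea"],
    ["eel", "swims", "sea"],
]

vectors = cooccurrence_vectors(corpus)
same_context = cosine(vectors["king"], vectors["queen"])
different_context = cosine(vectors["king"], vectors["fish"])
print(same_context > different_context)
```

In this toy, "king" and "queen" share all their neighbors and "king" and "fish" share none, so the first similarity is higher. Whether anything like this survives on a corpus as small as Linear A is a separate question.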
I understand the impulse to point out the obvious, but when the question is asked honestly rather than arrogantly or dismissively, it is even better to wait for someone to provide the specific answer; in this case, the reason that Zipf's law is of no help.
It wasn't my intent to be overly dismissive. But I see this sort of thing all the time, and I find this phenomenon interesting, so I wanted to engage with that aspect of it, specifically.
I agree with you in general though. Dismissing these things out of hand isn't helpful either. But multiple people had already made substantive replies to the actual content of their idea, anyway.
Useless dismissal. I asked a question; I didn't make an assertion. Besides, it allows for an exploration of the search space of solutions, which stimulates the depth of the discussion and might allow finer-grained questions that could then become innovative.
edit: I hope this will make you think twice next time. Using Zipf's law for Linear A was only attempted for the first time in 2021: https://hal.archives-ouvertes.fr/hal-03207615/document
so had I commented last year it would have been prior art.
I agree the idea is not very original, and yet we had to wait this long for it to be tried.
Edit: typos