BoringTimesGang · on Nov 8, 2024

If you are waiting for studies that perfectly model every variable before you spend 5 minutes walking, you are unlikely to be satisfied within your rapidly decreasing lifespan.

BoringTimesGang · on Oct 24, 2024

It's not a 'Japan thing'. I pay a higher rate on my mortgage because my spouse was not a citizen of where we live when we took it out. There are fewer providers willing to offer mortgages in this situation but, presumably, there's still enough of a price incentive that the premium isn't pulled out of thin air.

It's also common for landlords to ask for higher deposits or months paid up-front.

esperent · on Oct 24, 2024

It's also possible that banks have fewer legal obligations to non-citizens, so they make up a bullshit reason for charging you more. Which is absolutely something a bank would do.

BoringTimesGang · on Oct 23, 2024

This is how I got MSI to honour their warranty in spite of their stance that any failure at all is due to user error, since their products don't fail

nick__m · on Oct 23, 2024

I have the opposite experience with warranty.

I had a defective ATX PSU cable and MSI support sent me a whole cable kit overnight. And recently I bought a Corsair case; the iCUE controller had 2 defective ports and Corsair also sent me a replacement overnight.

My only "trick" with support is telling them upfront that I will leave a 5-star review on Amazon upon successful resolution of the problem.

dataflow · on Oct 23, 2024

Wow, nice. Did they show up? Did they settle?

BoringTimesGang · on Oct 24, 2024

Settled at the eleventh hour

BoringTimesGang · on Oct 18, 2024

Ah, I hope nobody ever uses that additional bit for additional encoding. That could cause all kinds of incompatibilities...

BoringTimesGang · on Oct 18, 2024

This is such an odd thing to read & compare to how eager my colleagues are to upgrade the compiler to take advantage of new features. There's so much less need to specify types in situations where the information is implicitly available after C++17/20. So many Boost libraries have been replaced by superior std versions.

And this has happened again and again on this enormous codebase that started before it was even called 'C++'.

BoringTimesGang · on Oct 18, 2024

>haven't followed closely

Don't worry, most people complaining about C++ complexity don't.

harry8 · on Oct 23, 2024

Hahaha, you're including Bjarne in that sweeping generalization? C++ has long had a culture problem revolving around arrogance and belittling others; maybe it is growing out of it?

I would point out that for any language, if one has to follow the standards committee closely to be an effective programmer in that language, complexity is likely to be an issue. Fortunately in this case it probably isn't required.

I see garbage collection came in C++11 and has now gone. Would following that debacle make many or most C++ programmers more effective?

BoringTimesGang · on Oct 8, 2024

Because human language is hard to boil down to a simple computing model and the problem is underdefined, based on naive assumptions.

Or perhaps I should say naïve.

cm2187 · on Oct 8, 2024

Well pretty much every other more recent language solved that problem.

kccqzy · on Oct 8, 2024

Almost no programming language, perhaps other than Swift, solved that problem. Just use the article's examples as test cases. It's just as wrong as the C++ version in the article, except it's wrong with nicer syntax.

zahlman · on Oct 8, 2024

Python's strings have uppercase, lowercase and case-folding methods that don't choke on this. They don't use UTF-16 internally (they can use UCS-2 for strings whose code points will fit in that range; while a string might store code points from the surrogate-pair range, they're never interpreted as surrogate pairs, but instead as an error encoding so that e.g. invalid UTF-8 can be round-tripped) so they never have to worry about surrogate pairs, and they know a few things about localized text casing:

    >>> 'ß'.upper()
    'SS'
    >>> 'ß'.lower()
    'ß'
    >>> 'ß'.casefold()
    'ss'

There are a lot of really complicated tasks for Unicode strings. String casing isn't really one of them.

(No, Python can't turn 'SS' back into 'ß'. But doing that requires metadata about language that a string simply doesn't represent.)

crote · on Oct 9, 2024

But that's wrong. The uppercase for "in Maßen" ("in moderate amounts") is not "IN MASSEN" ("in Massen", meaning "in massive amounts").

kccqzy · on Oct 8, 2024

Still breaks on, for example, Turkish i vs İ. It's impossible to do correctly without language information.

> (No, Python can't turn 'SS' back into 'ß'. But doing that requires metadata about language that a string simply doesn't represent.)

Yes that's my point. Because in typical languages strings don't store language metadata, this is impossible to do correctly in general.
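
As a rough illustration of the Turkish case, here is a minimal sketch in Python, whose `str` methods take no locale argument. The variable names are only for illustration.

```python
# str.upper()/str.lower() apply Unicode's default, locale-independent mappings,
# so the Turkish dotted/dotless distinction cannot come out right.
i_upper = "i".upper()            # Turkish expects U+0130 (İ); Python gives "I"
dotted_lower = "\u0130".lower()  # Turkish expects "i"; Python gives "i" + U+0307
plain_lower = "I".lower()        # Turkish expects U+0131 (ı); Python gives "i"

print(i_upper, [hex(ord(c)) for c in dotted_lower], plain_lower)
```

Getting the Turkish results requires passing the language in from outside the string, which is what the locale-taking APIs mentioned in this thread do.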

zahlman · on Oct 8, 2024

I'm not seeing anything in the Swift documentation about strings carrying language metadata, either, though?

kccqzy · on Oct 8, 2024

This lowercase function takes a locale argument https://developer.apple.com/documentation/foundation/nsstrin...

It looks like an old NSString method that's available in both Obj-C and Swift.

The casefold function is even older than that. https://developer.apple.com/documentation/foundation/nsstrin... Its documentation specifically includes a discussion of the Turkish İ/I issue.

tedunangst · on Oct 8, 2024

But that's wrong. The upper case for ß is ẞ.
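
For reference, a small Python sketch of what the default Unicode mappings do with the two characters, as implemented by CPython's `str` methods. Nothing here is locale-aware, and the variable names are illustrative.

```python
import unicodedata

capital_sharp_s = "\u1e9e"  # ẞ
name = unicodedata.name(capital_sharp_s)

lowered = capital_sharp_s.lower()  # ẞ lowercases to ß
uppered = "\u00df".upper()         # but ß still uppercases to "SS", not ẞ

print(name, lowered, uppered)
```

So the capital letter exists and lowercases as expected, but the default uppercase mapping for ß does not produce it.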

cm2187 · on Oct 8, 2024

C#'s "ToUpper" takes an optional CultureInfo argument if you want to play around with how to treat different languages. Again, a problem solved decades ago.

account42 · on Oct 9, 2024

This is not a locale issue, it's a Unicode version issue. Which highlights another problem with adding this to the base standard library.

IncreasePosts · on Oct 8, 2024

That was only adopted in Germany like 7 years ago!

kccqzy · on Oct 8, 2024

Well languages and conventions change. The € sign was added not that long ago and it was somewhat painful. The Chinese language uses a single character to refer to chemical elements so when IUPAC names new elements they will invent new characters. Etc.

extraduder_ire · on Oct 9, 2024

Does Unicode have space set aside for those new symbols to slot into? I know it's very rare, but it could get messy.

account42 · on Oct 9, 2024

Unicode is already messy. Chinese characters especially so due to Han unification.

Towaway69 · on Oct 9, 2024

Isn't uppercase for ß just ß - i.e. it's its own uppercase character?

bratwurst3000 · on Oct 9, 2024

there shouldn’t be an uppercase version of ß because there is no word in the german language that uses it as the first letter. the german language didnt think of allcaps. please correct me if I am wrong. If written in uppercase it should be converted to SZ or the new uppercase ß…. which my iphone doesn’t have… and converting anything to uppercase SS isn’t something germany wants …

account42 · on Oct 9, 2024

> there shouldn’t be an uppercase version of ß because there is no word in the german language that uses it as the first letter. the german language didnt think of allcaps.

Allcaps (and smallcaps) has always existed in signage everywhere. Before the computing age, letters were just arbitrary metal stamps -- and just whatever you could draw before that. Historically, language was not as standardized as it is today.

Towaway69 · on Oct 9, 2024

I don’t think that Germany wants a capital ß or that the German language requires one; rather, technology needs one to dot the i's and cross the t's.

account42 · on Oct 9, 2024

Not generally, no, but some applications used it that way because of the ambiguity of uppercasing ß to SS, which is why ẞ was added.

Towaway69 · on Oct 9, 2024

On the other hand, the German language has existed for several hundred years without having a capital ß but now it needs one?

True, capitalisation has always existed, but even that didn’t seem to have required a capital ß. Why now?

tialaramex · on Oct 8, 2024

Rust will cheerfully:

    assert_eq!("ὀδυσσεύς", "ὈΔΥΣΣΕΎΣ".to_lowercase());

[Notice that this is in fact entirely impossible with the naive strategy since Greek cares about position of symbols]

Some of the latter examples aren't cases where a programming language or library should just "do the right thing" but cases of ambiguity where you need locale information to decide what's appropriate, which isn't "just as wrong as the C++ version" it's a whole other problem. It isn't wrong to capitalise A-acute as a capital A-acute, it's just not always appropriate depending on the locale.
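
The position dependence can be sketched in Python, assuming CPython's `str.lower()`, which applies the final-sigma rule. The "naive" version below lowercases one character at a time and so has no surrounding context to look at.

```python
word = "ὈΔΥΣΣΕΎΣ"

# Whole-string lowercasing can see that the last sigma ends the word.
whole = word.lower()

# Character-by-character lowercasing sees each sigma in isolation.
naive = "".join(ch.lower() for ch in word)

print(whole, hex(ord(whole[-1])))  # last code point: final sigma, U+03C2
print(naive, hex(ord(naive[-1])))  # last code point: medial sigma, U+03C3
```

The two results differ only in the last letter, which is exactly the part a per-character mapping cannot get right.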

account42 · on Oct 9, 2024

Is this

    assert_eq!("\u{1F40}δυσσεύς", "ὈΔΥΣΣΕΎΣ".to_lowercase());

or

    assert_eq!("\u{03BF}\u{0313}δυσσεύς", "ὈΔΥΣΣΕΎΣ".to_lowercase());

For display it doesn't matter, but most other applications really want some kind of normalization, which does much, much more, so having a convenient to_lowercase() doesn't buy you as much as you think and can be actively misleading.
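
To make the precomposed-versus-decomposed distinction concrete, here is a minimal Python sketch using the standard `unicodedata` module. The two spellings of the first letter (omicron with smooth breathing) are written as escapes so the difference is visible.

```python
import unicodedata

precomposed = "\u1f40"       # ὀ as a single code point
decomposed = "\u03bf\u0313"  # ο followed by a combining comma above

# The two render identically but are different code point sequences.
same_raw = precomposed == decomposed

# After normalizing both to the same form, they compare equal.
same_nfc = (unicodedata.normalize("NFC", precomposed)
            == unicodedata.normalize("NFC", decomposed))

# NFD splits the precomposed letter into base + combining mark.
nfd = unicodedata.normalize("NFD", precomposed)

print(same_raw, same_nfc, [hex(ord(c)) for c in nfd])
```

Comparison, search, and hashing all depend on picking one normalization form first; case mapping alone does not settle that.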

MBCook · on Oct 8, 2024

So what?

That doesn’t prevent adding a new function that converts an entire string to upper or lowercase in a Unicode aware way.

What would be wrong with adding new correct functions to the standard library to make this easy? There are already namespaces in C++ so you don’t even have to worry about collisions.

That’s the problem I see. It’s fine if you have a history of stuff that’s not that great in hindsight. But what’s wrong with having a better standard library going forward?

It’s not like this is an esoteric thing.

wakawaka28 · on Oct 9, 2024

The reason that wasn't done is that Unicode is not really in older C++ standards. I think it may have been added to C++23 but I am not familiar with that. There are many partial solutions in older C++ but if you want to do it well then you need to get a library for it from somewhere, or else (possibly) wait for a new standard.

Unicode and character encodings are pretty esoteric. So are fonts. The stuff is technically everywhere and fundamental, but there are many encodings, technical details, etc. And most programmers only care about one language, or else only use UTF-8 with the most basic chars (the ones that agree with ASCII). That isn't terrible. You only need what you actually need. Most programs don't strictly have to be built for multiple random languages, and there is kind of a standard methodology to learn before you can do that.

BoringTimesGang · on Oct 8, 2024

Now double all of that effort, so you can get it to work with Windows' UTF-16 wstrings.

account42 · on Oct 9, 2024

Better to just convert WTF-16 (Windows filenames are not guaranteed to be valid UTF-16) to/from WTF-8 at the API boundary and then do the same processing internally on all platforms.
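
A rough Python sketch of that boundary conversion, using the standard `surrogatepass` error handler as a stand-in for WTF-8. That is an assumption made for illustration: it behaves like WTF-8 for this lone-surrogate input but is not a full WTF-8 implementation. The byte string is a made-up example of an ill-formed name.

```python
# "a", a lone high surrogate (0xD800), "b" in UTF-16LE: not valid UTF-16.
raw = "a".encode("utf-16-le") + b"\x00\xd8" + "b".encode("utf-16-le")

# A strict decoder rejects the unpaired surrogate.
try:
    raw.decode("utf-16-le")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lenient conversion at the boundary: keep the lone surrogate as a code point,
# then store it internally as a 3-byte, UTF-8-style sequence.
text = raw.decode("utf-16-le", "surrogatepass")
internal = text.encode("utf-8", "surrogatepass")

# Converting back on the way out reproduces the original bytes exactly.
round_trip = internal.decode("utf-8", "surrogatepass").encode("utf-16-le", "surrogatepass")

print(strict_ok, internal, round_trip == raw)
```

The point of the design is that the name survives a round trip unchanged, so the rest of the program can work on one 8-bit representation on every platform.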

BoringTimesGang · on Oct 8, 2024

>It is issues like this due to which I gave up on C++. There are so many ways to do something and every way is freaking wrong!

These are mostly Unicode or linguistics problems.

tralarpa · on Oct 8, 2024

The fact that the standard library works against you doesn't help (tolower takes an int, but only kind of works (sometimes) correctly on unsigned char, and wchar_t is implicitly promoted to int).

BoringTimesGang · on Oct 8, 2024

tolower is in the std namespace but is actually just part of the C89 standard, meaning it predates both UTF-8 and UTF-16. Is the alternative that it should be made unusable, and more existing code broken? A modern user has to include one of the c-prefix headers to use it, already hinting to them that 'here be dragons'.

But there are always dragons. It's strings. The mere assumption that they can be transformed int-by-int, irrespective of encoding, is wrong. As is the assumption that a sensible transformation to lower case without error handling exists.

account42 · on Oct 9, 2024

> Is the alternative that it should be made unusable, and more existing code broken?

It should be marked [[deprecated]], yes. There is no good reason to use std::tolower/toupper anywhere - they can neither do Unicode properly nor are they anywhere close to efficient for ASCII. And their behavior depends on the process-global locale.

BoringTimesGang · on Sept 16, 2024

Is the misspelling (dallars) a reference I've missed?