this post was submitted on 08 Sep 2024
1219 points (98.3% liked)
Programmer Humor
Sounds like it’s actually using XSLT or some kind of content validation, which to be honest sounds like good practice.
Here's an example of a text object taken from the XML, if you're curious: https://clips.clb92.xyz/2024-09-08_22-27-04_gfxTWDQt13RMnTIS.png
EDIT: And with more complicated strings (like ones having numbers or symbols - just regular-ass ASCII symbols, mind you) there will be tens of such elements, because apparently numbers and letters don't even work the same. Even line breaks have their own element. And if the number of these elements and their charLen don't match what's actually in pt:data, it won't open the file.
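To make the check described above concrete, here's a rough sketch. It assumes (this is a guess, not the application's actual logic) that each per-run element declares a charLen and that those lengths have to add up to the length of the text in pt:data. `Run` and `runs_match_data` are hypothetical names made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class Run:
    # Hypothetical stand-in for one per-run element.
    # Only the charLen attribute is actually named in the thread.
    char_len: int


def runs_match_data(runs: list[Run], data: str) -> bool:
    """Return True if the declared run lengths add up to the text length."""
    return sum(run.char_len for run in runs) == len(data)


# Letters, a space, digits and a line break each get their own run here,
# mirroring the "numbers and letters don't work the same" observation.
text = "Abc 123\n"
runs = [Run(3), Run(1), Run(3), Run(1)]

print(runs_match_data(runs, text))       # True: 3 + 1 + 3 + 1 == 8
print(runs_match_data(runs[:-1], text))  # False: one run is missing
```

If the real format works anything like this, editing the text by hand without updating every charLen would be enough to make the file fail to open.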
Is it because of the lower case Latin æ since it’s technically one character even if two bytes?
Nope, doesn't seem like it.
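For anyone wondering about the "one character, two bytes" part of the question: a quick sketch of the difference in Python, assuming the text is UTF-8 encoded (the thread doesn't say which encoding the file actually uses). `lengths` is just an illustrative helper name:

```python
def lengths(s: str) -> tuple[int, int]:
    """Return (number of code points, number of UTF-8 bytes) for s."""
    return len(s), len(s.encode("utf-8"))


print(lengths("æ"))  # (1, 2): one code point, two bytes in UTF-8
print(lengths("a"))  # (1, 1): plain ASCII is one byte per character
```

So a length field counted in characters and one counted in bytes would disagree on a string containing æ, which is why it was a reasonable first guess.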
What a mess… sounds like the devs got burned by various Unicode edge cases (RTL, etc.).