From my other comment about potential issues with o1 vs. o3/o4:
The other big difference between o1 and o3/o4 that may explain the higher rate of hallucinations is that o1's reasoning is not user accessible, and it's purposefully trained not to have safeguards on reasoning, whereas o3 and o4 have public reasoning with reasoning safeguards. I think safeguards may be a significant source of hallucination because they change prompt intent, encoding, and output. So on a non-o1 model, that safeguard process happens twice per turn (once for reasoning, once for output) and both results get accumulated into the context. On o1 it happens once per turn, only for the output, before accumulating.
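A minimal sketch of the loop shape I'm describing, assuming hypothetical `think`, `respond`, and `safeguard` helpers; none of this reflects OpenAI's actual pipeline, it just shows where a filtering pass could alter text before it accumulates into context:

```python
# Hypothetical per-turn loops. All functions are stand-ins, not a real API;
# the point is how many times safeguard-altered text feeds back into context.

def think(prompt: str, context: list[str]) -> str:
    return f"reasoning about: {prompt}"  # placeholder for chain-of-thought

def respond(reasoning: str) -> str:
    return f"answer based on: {reasoning}"  # placeholder for final answer

def safeguard(text: str) -> str:
    """Stand-in for a safety filter that may rewrite or redact text."""
    return text  # imagine redaction/rephrasing happening here

def turn_o1_style(context: list[str], prompt: str) -> list[str]:
    reasoning = think(prompt, context)        # raw, unfiltered reasoning
    output = safeguard(respond(reasoning))    # safeguard applied once: output only
    return context + [prompt, output]

def turn_o3_style(context: list[str], prompt: str) -> list[str]:
    reasoning = safeguard(think(prompt, context))  # safeguard pass 1: reasoning
    output = safeguard(respond(reasoning))         # safeguard pass 2: output
    return context + [prompt, reasoning, output]  # both altered texts accumulate
```

In the o3-style loop, any drift the safeguard introduces compounds: the filtered reasoning is fed back into context alongside the filtered output, so intent changes stack turn over turn.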
China has been actively working to curb publish-or-perish pop-science bullshit for about 5 years now. They banned incentive and performance structures in universities that rely on publication rankings.
https://www.mpiwg-berlin.mpg.de/observations/1/end-publish-or-perish-chinas-new-policy-research-evaluation
Despite this reform, China has maintained its leadership in paper quality and quantity, and it now leads in top-level subject-matter experts (SMEs).
https://www.scmp.com/news/china/science/article/3295011/china-surpasses-us-tally-top-scientists-first-time-report
Given that, it's more likely that this result is genuine than it would have been 5 years ago.