Game developer claims Intel is selling defective CPUs
A game developer has published a report that claims Intel has been selling defective CPUs to users. Alderon Games says that some stability issues seem to be affecting Intel's 13th and 14th gen K/KF/KS chips.
What's wrong with Intel 13900K and 14900K CPUs?
This is not the first time that Raptor Lake processors have been found to be unstable. A bulletin published in April this year by Epic Games' RAD Game Tools attributed Oodle decompression failures to the instability of Intel processors. The statement said that the problem stemmed from the high clock rates and power usage of the CPUs, i.e. overclocked processors.
The report by Matthew Cassells, the founder of Alderon Games, reveals that several users who played the company's multiplayer dinosaur survival game, Path of Titans, ran into problems such as crashes, instability, and memory corruption. Alderon analyzed the crash reports and identified failures in five areas.
Users were experiencing crashes on Intel 13th and 14th gen CPUs while playing the game. The game's developers also ran into instability while working on the game or running benchmark tools. They say that the errors even resulted in SSD and memory corruption, which is alarming because a CPU fault rarely goes that far. The developers also highlighted that the problem affected their game's servers, crashing them completely.
Several reports on Reddit indicate that gamers are running into "out of video memory" (VRAM) error messages, even though their PCs have sufficient memory. Players have also reported problems in other games, related to either the CPU or the GPU.
Experts say Intel has a big problem
Wendell Wilson posted a video on Level1Techs' YouTube channel that explains why Intel's 13900K and 14900K CPUs are crashing frequently. An analysis of two different telemetry databases (crash reports from two Unreal Engine games) showed that the majority of issues were related to the Intel 13900K and 14900K. Level1Techs said that data centers which had these CPUs saw a 50% failure rate in a seven-day period.
The analysis indicated that the CPUs degraded over time, which resulted in more and more errors. Alderon Games had made a similar point: these CPUs had a nearly 100% failure rate in its tests, and it was only a matter of time before the affected CPUs failed. Another video, published by Gamers Nexus and featuring Steve Burke and Wendell, explains more about the problem. The findings suggest that the only way to fix the instability issues could be to replace the CPUs.
On a similar note, the developers of the popular game, Warframe, have also revealed (via Tom's Hardware) that Intel's K-series chips, specifically the Core i9 and Core i7 CPUs, are the major culprits in nearly 80% of crashes.
Intel had previously confirmed that it was investigating the issues reported by users and outlets, and has published some guidance. The company noted, in a support document, that some motherboards (BIOS) had been allowing the processor to run "at turbo frequencies and voltages even while the processor is at a high temperature." Intel is advising users to run their PCs at their stock settings, i.e. without overclocking them.
The chipmaker released a new microcode and BIOS update intended to fix a bug in eTVB (Enhanced Thermal Velocity Boost). But this bug was not identified as the root cause of the instability issues, and Intel is still investigating the problems. Intel is working with motherboard partners to roll out the patch for the eTVB bug as part of BIOS updates, which should be released on July 19th.
Alderon says that Intel's updates have not fixed the problems so far, and that the company has swapped its servers to AMD CPUs, which it claims are experiencing far fewer crashes than the Intel CPUs. The developers are also notifying users who have these processors about the issue, and how they can fix it.
What could Intel do, if this is a hardware defect? Recall the affected CPUs and offer replacements?
Are you using an Intel 13th or 14th gen CPU? Have you overclocked it? Have you run into similar problems with your PC? Share your experience with us.
https://www.tomshardware.com/pc-components/cpus/intels-patch-for-cpu-instability-and-crashing-issues-rolls-out-from-msi-and-asrock-asus-rog-motherboard-users-can-also-access-a-beta-update
https://www.neowin.net/news/asus-msi-bios-updates-for-intel-13th-14th-gen-unstable-crashing-cpus-now-rolling-out/
Looks like Asus, MSI and soon ASRock are releasing new BIOS updates to address the instability problems with some 13th & 14th gen Intel CPUs
Smirk in 14nm+++ 10th gen
@Ashwin
You're a few months late.
Short version, the bios may set the CPU power limit to 4096W by default. On my MSI Z790 Mobo it does this when you set the cooling solution to liquid in the bios.
The fix is to manually set the correct values in the bios.
https://www.tomshardware.com/pc-components/cpus/is-your-intel-core-i9-13900k-crashing-in-games-your-motherboard-bios-settings-may-be-to-blame-other-high-end-intel-cpus-also-affected
This is not the fix. Updating the bios is the first part of the solution. Then, you have to lock the all core frequency to 5.6 GHz. If your CPU is already degraded, you have to also add a positive voltage offset, depending on how much it’s degraded, in +0.01V intervals, until you don’t crash anymore. Optionally, you can increase the loadline level from 4 to 6 (for example) to keep the voltage offset lower, just make sure you don’t get thermal throttled.
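The trial-and-error step in the comment above (raise the offset by 0.01 V at a time until the crashes stop) can be sketched as a simple search loop. This is only an illustration under stated assumptions: `is_stable` is a hypothetical callback standing in for "apply this offset in the BIOS by hand, run a stress test, report whether it passed", and the 0.10 V cap is an arbitrary safety stop chosen for the example, not a recommended value. Nothing here reads or writes real BIOS settings.

```python
from typing import Callable, Optional


def find_min_stable_offset(
    is_stable: Callable[[float], bool],
    step_v: float = 0.01,        # increment named in the comment above
    max_offset_v: float = 0.10,  # assumed cap for this sketch, not a recommendation
) -> Optional[float]:
    """Return the smallest tested offset (in volts) that passes, or None."""
    steps = int(round(max_offset_v / step_v))
    for i in range(steps + 1):
        # Build each candidate from the step count to avoid float drift.
        offset = round(i * step_v, 3)
        if is_stable(offset):
            return offset
    # No tested offset up to the cap was stable.
    return None


if __name__ == "__main__":
    # Hypothetical stand-in: pretend the chip stops crashing from +0.03 V upward.
    print(find_min_stable_offset(lambda v: v >= 0.03))
```

In practice each call to `is_stable` would be a manual reboot, BIOS change and stress run, so the loop mostly documents the order of the steps: start at no offset, go up one step at a time, and stop at the first value that holds.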
Probably won’t recall. Remember the Pentium FDIV bug?
My uncle still has an old 3rd generation i5 laptop with four cores and it still works like a charm, controlling massive farming appliances and administrative tasks since 2012. This amazing old computer works now with W10 and it’s more reliable imho than my newest laptop with a pretty flamboyant i7 whatever the numbering it has inside. Also my uncle has an old diesel car that is able to run 1600 km without refueling. My electric car needs to be recharged every 500 km and I need to plan the route to find charging stations. Newer doesn’t mean better. At least not better in the sense that the word “better” ought to mean. Thanks for the article! :]
Old tech is not applicable here. Electromigration happens at smaller node sizes. I’d be surprised if a future 2-3nm CPU will survive one single decade at all. First order principles and physics will affect all vendors alike. Still AMD seems to do everything right at the time.
I had a gaming PC built for my 12th birthday back in 2012 with an ivy bridge i5, still kicks ass today, no problems at all, granted I only play games from that era or older.
“Newer” typically means worse, less free, or compromised in some way these days.
Newer means most of the time new process and new physical limits to overcome, as well as new materials. Pre-2008 we did not even have hafnium oxide gates for high-k.
You are not making a point by saying “older is better” when the materials wouldn’t even be applicable today. Older means really: They had it easy. Today we are talking about quantum field simulations of ever shrinking transistors that are becoming more and more fragile.
So, I’m not convinced that today tech is “worse”. It’s just more prone to contaminate or electromigrate.
Ah, a wiseacre…
Okay Gunther:
“So, I’m not convinced that today tech is “worse”. It’s just more prone to contaminate or electromigrate.”
So the “tech of today” in this instance faces different issues when dealing with the intended workloads, which result in a degraded performance, instability, or reduced lifespan.
The “tech of yesterday” faced different challenges (in your opinion, fewer of them), and seemingly handled intended workloads without degraded performance, instability, or reduced lifespan.
To an end user, can you see what that looks like?
Example:
“Ah, I bought a new car from you, capable of going much faster than my old one, but when I try to go into the territory of the new speed, the car actually sputters, stutters and handles poorly – not always, but sometimes – what gives?”
And Mr genius answers “the new car is so much faster than the old one, do you have any idea the intricacies and engineering involved in making a car go so fast? It’s not trivial stuff!”
The correct answer is: “You used to sell a product which was ready for use, and performed reliably – how about not selling future products until they’ve also passed the same level of testing and stability?”
The end user does not care about these specifics, but they have an expectation regarding how the product should function. An expectation you (the manufacturer) created through previous products.