Skylake i3 and i5 NUC WHEA Errors
During the last couple of months there has been an unsettling number of reports that Skylake NUCs have stopped working properly. Intel even pulled one BIOS update: BIOS version 33 is no longer available. Typically, you will see WHEA_UNCORRECTABLE_ERROR if you are running Windows, or MCE errors if you run Linux. It seems that if your NUC exhibits these problems (frequent crashes with WHEA errors), it has already reached the point of no return. In that case you need to return the NUC to Intel for repair.
Quite a lot of people have been frustrated by the lack of response in the Intel forums. Initially Intel released a FAQ for the Skylake issues, but it hasn’t been updated in recent weeks.
EDIT: A few minutes after this article was posted, the FAQ was also updated with the information below.
EDIT 2: It seems that Intel has found the solution and BIOS version 42 should correct the problems. It’s highly recommended to update to BIOS version 42 if your system does not exhibit any WHEA issues yet. If it does, you’ll need to get the unit replaced by Intel.
It finally seems that someone has picked up the ball: user cvare has posted an update that includes the date by which the next piece of information will come out at the latest. The message was:
Hi all,
Intel is actively investigating the “WHEA UNCORRECTABLE ERROR” issue reported on the Intel® NUC Kit NUC6I3SYH, Intel® NUC Kit NUC6I3SYK, Intel® NUC Kit NUC6I5SYH, and Intel® NUC Kit NUC6I5SYK products. We have replicated the issue with BIOS 0039. We do not currently have a workaround for this issue.
Customers who see this error repeatedly on this product are encouraged to contact Intel Customer Support. We will provide an additional update by Friday, April 8, or sooner if relevant information becomes available.
Update on April 8th (thanks for the heads up SebasF1):
Intel continues to actively investigate the “WHEA UNCORRECTABLE ERROR” issue reported by end users on the Intel® NUC Kit NUC6i3SYH, NUC Kit NUC6i3SYK, NUC Kit NUC6i5SYH, and NUC Kit NUC6i5SYK products.
Intel has an internal task force diligently working to resolve this issue as soon as possible and deliver a robust solution for affected users. Intel does not currently have a workaround for this issue.
Intel confirms the issue currently being investigated is not corrected by any current BIOS, limiting C-states, updating the operating system software, or replacing peripheral components.
Intel appreciates the patience exhibited by affected users while we work to resolve this difficult issue. Users who see this error repeatedly on this product are encouraged to contact Intel Customer Support for assistance.
Intel will provide another update by Friday April 15th, or sooner if relevant information becomes available.
Update on April 12th: Intel recommends upgrading the BIOS to the recently released version 42, which includes improved “electrical overstress protection”:
Intel is continuing to investigate the “WHEA UNCORRECTABLE ERROR” issue reported by end users on the Intel® NUC Kit NUC6i3SYH, Intel® NUC Kit NUC6i3SYK, Intel® NUC Kit NUC6i5SYH, and Intel® NUC Kit NUC6i5SYK products. Intel will continue to provide updates on this ongoing investigation of the “WHEA UNCORRECTABLE ERROR”. We will provide an additional update by Friday, April 22, or sooner if relevant information becomes available.
Intel has released BIOS version 0042 that improves the electrical overstress protection in the voltage regulator circuitry on the Intel® NUC Kit NUC6i3SYH, Intel® NUC Kit NUC6i3SYK, Intel® NUC Kit NUC6i5SYH, and Intel® NUC Kit NUC6i5SYK products. While the root cause investigation continues, Intel recommends customers update the Intel NUC to BIOS version 0042.
Final update on April 22nd:
Intel has characterized the “WHEA UNCORRECTABLE ERROR” issue on the Intel® NUC Kits that have been returned to Intel. Intel has released BIOS 0042 which correctly initializes the voltage regulator and eliminates the electrical over-voltage which has been proven to cause the “WHEA UNCORRECTABLE ERROR” issue. Customers who continue to experience the “WHEA UNCORRECTABLE ERROR” issue are encouraged to contact Intel Customer Support.
Intel recommends customers update the Intel NUC to BIOS version 0042. Systems with a product number of SAxxxxxx-503 or later, or systems with a blue dot label added to the Intel NUC and the outer box, have already been updated to BIOS version 0042.
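As a rough illustration of the rule in Intel’s statement, the sketch below treats the three digits after the dash in the product (SA) number as a revision and compares it with 503. The exact label format (assumed here to be letters and digits, a dash, then three digits), the made-up sample number `SA123456-502`, and the function name are all assumptions for illustration, not something Intel specifies beyond the “SAxxxxxx-503 or later” wording.

```python
import re

# Assumed label format: letters/digits, a dash, then a three-digit revision,
# for example the made-up "SA123456-503". Real labels may differ.
_SA_PATTERN = re.compile(r"^\s*([A-Za-z0-9]+)-(\d{3})\s*$")


def ships_with_bios_0042(product_number: str, first_fixed_revision: int = 503) -> bool:
    """Return True if the revision suffix is 503 or later (hypothetical helper)."""
    match = _SA_PATTERN.match(product_number)
    if match is None:
        raise ValueError(f"unrecognised product number: {product_number!r}")
    return int(match.group(2)) >= first_fixed_revision
```

A unit that fails this check is not necessarily faulty; it only means the BIOS should be checked and updated by hand. The blue dot label mentioned by Intel cannot be detected this way.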
Can you clarify what MCE errors are? I’m running a NUC i3 with Linux and have recently experienced some system lockups. Not really sure what to look for tho. dmesg seems ok.
Basically you should see a kernel oops in dmesg, with a reference to a hardware error and a machine check event.
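For anyone unsure what to look for, here is a minimal sketch that filters kernel log text for lines that look like machine check reports. The list of search terms is an assumption chosen for illustration and is not exhaustive; the function name is hypothetical.

```python
import re
from typing import List

# Terms that commonly accompany machine check reports in kernel logs.
# This list is an assumption for illustration, not a complete signature.
MCE_PATTERN = re.compile(r"machine check|hardware error|\bmce\b", re.IGNORECASE)


def find_mce_lines(log_text: str) -> List[str]:
    """Return the log lines that mention a machine check or hardware error."""
    return [line for line in log_text.splitlines() if MCE_PATTERN.search(line)]
```

To try it, save the output of dmesg to a file, read the file into a string and pass it to `find_mce_lines`. An empty result suggests the lockups have another cause, such as the graphics driver.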
Your lockups are probably graphics related. Disable RC6 (i915.enable_rc6=0).
Dammit! i915.i915_enable_rc6=0
I’ve had a few teething issues but thankfully not of this variety – would be interested to know what the catalyst is for it to occur!
So this is for the NUCs before the announced PCN I guess?
(https://communities.intel.com/thread/100085)
I recently sent my NUC back and expecting a new one with the PCN above… I hope it solves the problem.
This is pretty scary! I’ve had my Skylake NUC freeze once or twice and had to restart the machine, but I assumed the cause was Firefox.
I have a nuc6i5syh running Ubuntu 15.10 with the recommended components from this very site: 8GB DDR4 RAM from Kingston HyperX, and a Samsung EVO 250GB SSD.
Have upgraded the BIOS to v39, and anxiously awaiting more details from the Intel team. Would like to avoid sending it back for a replacement if I can, but this increasingly sounds like a hardware issue?
Here are three error messages I got after running dmesg:
[ 2.348186] [drm:csr_load_work_fn [i915]] *ERROR* Unknown stepping info, firmware loading failed
[ 2.348202] [drm:csr_load_work_fn [i915]] *ERROR* Failed to load DMC firmware, disabling rpm
[ 9859.493388] [drm:i915_hangcheck_elapsed [i915]] *ERROR* Hangcheck timer elapsed… render ring idle
My NUC6i5SYH, running Debian “Testing” without (!!!) X-Server, froze too – with BIOS v36 and Kernel 4.3.x.
Now I am running BIOS v39 and Kernel 4.4.0 and have gone some days without a freeze – I hope forever ;-)
I also see these i915 firmware messages in my ‘dmesg’ (without X-Server) – but I think these are not the cause of the freezes. The non-free firmware packages are installed.
@Bulent Yusuf The first 2 lines are fine. What they mean is that the driver doesn’t recognize your particular model of Skylake and won’t load the firmware for DMC, which is fine for now; it just won’t use some of the new power saving features. It’ll be fixed in kernel 4.6 or 4.7. The last line is probably incidental, and is because of driver immaturity (I’ve had 1 or 2 GPU freezes on Skylake myself). You probably didn’t even notice it, or maybe your screen froze for a couple of seconds. You should disable RC6 if that bothers you (you’ll lose a bit of power savings again, but it only really matters for battery powered devices). Same for you @kossmann
Btw, I recommend running dmesg -T, since that will print out human readable timestamps instead of seconds-since-last-boot :)
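To show what that conversion amounts to, here is a small sketch that turns a seconds-since-boot timestamp into a wall-clock time, given a boot time. It only illustrates the idea and is not how dmesg itself is implemented. The function name and the sample boot time are made up.

```python
import re
from datetime import datetime, timedelta

# Matches the "[ 9859.493388] message" prefix used in the log excerpts above.
_TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def humanize(line: str, boot_time: datetime) -> str:
    """Replace a seconds-since-boot prefix with an absolute date and time."""
    match = _TIMESTAMP.match(line)
    if match is None:
        return line  # leave lines without a timestamp untouched
    when = boot_time + timedelta(seconds=float(match.group(1)))
    return f"[{when:%Y-%m-%d %H:%M:%S}] {match.group(2)}"
```

With a boot time of 08:00:00, the hangcheck line at 9859 seconds lands at 10:44:19, which is much easier to match against the moment the screen froze.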
Disable RC6 via the kernel parameter “i915.i915_enable_rc6=0”, or somewhere in the BIOS?
Kernel parameter. This is a GFX driver power saving feature. I don’t think the BIOS can do anything about it.
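Since two spellings of the parameter appear in this thread, a quick way to see what a system was actually booted with is to parse the kernel command line. The sketch below is a minimal example; which spelling a given kernel version accepts is an assumption you should verify for your own kernel, and the function name is hypothetical.

```python
from typing import Optional

# Both spellings mentioned in this thread. Which one a particular kernel
# honours is an assumption to verify, not something this code can tell you.
RC6_KEYS = ("i915.enable_rc6", "i915.i915_enable_rc6")


def rc6_setting(cmdline: str) -> Optional[str]:
    """Return the value of the RC6 parameter on a kernel command line, if any."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in RC6_KEYS:
            return value
    return None  # parameter not present: the driver default applies
```

On Linux the command line of the running kernel can be read from /proc/cmdline and passed to `rc6_setting`. A result of "0" means the parameter was passed at boot with RC6 disabled.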
@Nopants and @kossman, thank you both for the insight, you’ve been really helpful!
I am having trouble understanding all of the issues people are having with this Skylake unit. I have the i5 at home and have a dozen of the i3 or i5 units deployed at our office, and have not had any of these issues. Yes, I am at BIOS 39 on all of these, but before that I saw no issue beyond a bit of heat I didn’t expect under lighter loads. There has to be something to that, but these are from day 1 of the shipping product and the November build of Windows 10. In every case Windows 7 with USB3 support was installed and activated, and then a clean install of Win10 was done. Each unit is running the latest chipset, video, and network drivers, and each device is also using a pair of either 4 or 8 gig Kingston HyperX memory modules.
It’s not something that anyone is doing wrong or that you are doing right. It’s the hardware that is not working properly, most probably an error/fault in the hardware design somewhere…..
It just seems very odd that I haven’t experienced this at all. There is clearly some component that is aiding this crash. Also, a clear design difference between the units that I have, which were shipped the same week NUCBLOG got theirs.
I RMA’d my nuc6i5syh because of the WHEA error. My dealer wanted to exchange it for a new one, but their distributor dropped this model of NUC from its catalogue because of the high failure rate, so they gave me my money back. A real disaster for Intel’s reputation. I am now looking at the Gigabyte Brix GB-BSi5H-6200, but I have lots of doubts as it has a Skylake processor.
News (very bad news) from Intel about WHEA errors and the NUC6ixxx.
“Intel continues to actively investigate the “WHEA UNCORRECTABLE ERROR” issue reported by end users on the Intel® NUC Kit NUC6i3SYH, NUC Kit NUC6i3SYK, NUC Kit NUC6i5SYH, and NUC Kit NUC6i5SYK products.
Intel has an internal task force diligently working to resolve this issue as soon as possible and deliver a robust solution for affected users. Intel does not currently have a workaround for this issue.
Intel confirms the issue currently being investigated is not corrected by any current BIOS, limiting C-states, updating the operating system software, or replacing peripheral components.
Intel appreciates the patience exhibited by affected users while we work to resolve this difficult issue. Users who see this error repeatedly on this product are encouraged to contact Intel Customer Support for assistance.
Intel will provide another update by Friday April 15th, or sooner if relevant information becomes available.”
It sounds like Intel doesn’t have a clue where the fault in the hardware design is, which really makes me worry.
https://communities.intel.com/thread/99692?tstart=0
I really hope this problem is not going to affect the upcoming NUC Skull Canyon since I have my heart set on buying it as soon as it comes out.
After getting my initial unit replaced my second unit has also died within 2 weeks of use. Not sure if I want to try a 3rd time around.
Is it related to the m.2 components, perhaps? Those who aren’t affected could be using just a 2.5 SSD?
I got the WHEA error in my new SYK (my third new NUC to get it!!) and Intel offered to refund not only my NUC, but the memory and m.2 drive, which I happily accepted. Now the problem is finding a NUC replacement. Since it’s an Intel problem, I would assume it could happen to any competitor that uses Intel architecture, like the BRIX.
I hope Intel resolves this… it’s a major problem.
Using NUC Blog recommended components (8 GB Kingston HyperX, 256 GB Samsung EVO M.2 drive), I have not encountered any such problems. Running a 24″ 4k monitor @59hz and I do get CTDs due to the still immature Iris 540 drivers, but no other issues since purchase, unless you count the temporary M.2 drive disappearance after the “33” BIOS.
Currently on BIOS “39” and beta “404” graphics drivers. I actually haven’t had a CTD yet on this setup, but since I haven’t been playing games much lately it could just be that.
Just to clarify: You run these: “Beta 15.40.20.4404”? Because I tried them due to freezes with all the other drivers in some games and the beta driver didn’t use GFX turbo / dynamic frequency, meaning the 540 was clocked at 300MHz. This made the games unplayable.
Yeah, that was the driver. Thanks, I didn’t know it throttled the Iris 540 on my NUC6i5syk. Bummer. As mentioned I haven’t been playing games much since the install and that’s surely why I never noticed. I’ll have to switch drivers if I start playing again. Maybe they’ll have a full-featured update by then.
This is exactly the same setup I have in about a dozen different systems now, the HyperX may be one of the secrets here. No issue whatsoever. It’s very solid DDR4 and any of the value ram that’s out there seems to be part of these blue screens and lock ups. As for graphic drivers I just have the latest from the intel NUC drivers page.
May I suggest that for the meanwhile, people who are looking to buy a NUC, should consider the HP EliteDesk 800 G2 Mini (http://amzn.to/1VcJNGq). It’s about the same size, and features the i7-6700T which has 35W TDP.
And like the NUC6ixxx, it has room for one M.2 and one 2.5″ SATA.
Do you own the EliteDesk G2? How well does it perform? I like it, but I’d rather have the Intel 580 than the Intel 530.
Emanuel,
I don’t have an EliteDesk G2. I’m considering buying a couple units for my workplace.
I’m guessing the new Skull Canyon NUC (NUC6i7KYK) is gonna do better in GPU performance, but it would be nice if someone could pit them head-to-head and see how they stack up in real world scenarios.
I was planning on buying one of these, they fulfill everything I would want from a new computer, so I could retire my slowly failing Core2Duo machine. I moved almost all of my gaming to consoles now, but for the occasional older game this particular Skylake i5 seems to provide ample power. I would add a 256GB M.2 to my equally sized SATA SSD and storage would be fine as well.
But this is putting me off, I won’t risk buying a faulty machine then having to run for a replacement soon after at these odds. And every alternative features a much less powerful graphics solution or costs a lot more, not good enough.
I hope this gets settled soon; the 8th of April update wasn’t really much different from the earlier “please wait”. If this requires a hardware fix, it would really be better to have some means of differentiation. Many shops will try to sell the old revision long after.
BTW: BIOS 42 just got released.
Yep, people already started using BIOS 42. See here: https://communities.intel.com/thread/101012
I think I’ll hold off for a week or so–or until Olli tries it successfully–to be sure there are no “gotchas”. Intel’s recent BIOS update history for the Skylake NUCs is not something to inspire confidence, to say the least. :p
I just updated to 42 via F7. No problems whatsoever. Settings didn’t change either. I can’t see any difference in behaviour. All good. As before. (elementary OS 0.3.2 on Kernel 4.2.8 ckt7)
Still waiting on my BSOD unit to be RMA’ed, tried BIOS 42 on that and it didn’t fix the problem, so whatever damage has occurred certainly appears to be hardware related. Hopefully this prevents it from occurring however.
I updated to 42 last night. No problems so far…
I was wondering if anyone here is using these RAM modules: http://amzn.to/1SQifAh ??
This 2x16GB G.Skill RAM set is the fastest I could find, running at 2800MHz.
I am not sure whether it would run at 2800MHz. You may want to post this in the Intel forum.
Sort of worried they will keep those bugs in the Skull Canyon I7 …
Yeah…me too.
They released a PCN (https://communities.intel.com/thread/100085) some time ago, but the “new” NUCs haven’t made it to customers yet afaik*. It sounded like that will fix the problem, and all the current effort is to solve the issue for the NUCs before the PCN. So I don’t think that’ll be skipped in Skull Canyon…
*I’m still waiting for mine. They said it’ll be out of the factory early May.
The PCN issued with the increased capacitors is not designed to fix WHEA errors, but to increase memory compatibility.
I was getting “hardware memory corruption” errors on Ubuntu (changing the memories didn’t help). I somehow thought it was similar to these WHEA errors on Win. Maybe my problem was nothing to be fixed with the PCN after all =/
http://www.computerworld.com/article/3059677/computer-processors/heres-what-the-new-intel-will-look-like.html
I was planning on buying the new NUC6i3 after they fixed the bugs but no more. I imagine the Intel NUC “support” team is now busy polishing their resumes and mailing them out.
What a shame because the NUC could have been my perfect pc…
Intel (an Intel employee) has now confirmed that this is a hardware problem. The new BIOS 42 “improves the electrical overstress protection in the voltage regulator circuitry on the Intel® NUC Kit”. (Posting from cvare almost at the bottom of this page: https://communities.intel.com/thread/99692?start=315&tstart=0 )
But “improves protection” doesn’t equal “eliminates problem”.
That really sucks and I’m really curious how Intel intends to handle the situation.
It’s officially confirmed in writing. https://communities.intel.com/docs/DOC-110236
So basically Intel is throttling the CPU with the new BIOS, as not to “overstress” it?
Then I’d rather get the older NUC5.
Intel now changed their wording in their statement. The 42 BIOS now initiates the voltage regulator correctly and thereby eliminates (!) the cause for the electrical overstress.
My 6i3 runs on 39 from the beginning and 42 a day after release. I ran several Linux benchmark tests (Phoronix suite) and can confirm that NOTHING is throttled down.
The new Intel statement reads:
Intel has characterized the “WHEA UNCORRECTABLE ERROR” issue on the Intel® NUC Kits that have been returned to Intel. Intel has released BIOS 0042 which correctly initializes the voltage regulator and eliminates the electrical over-voltage which has been proven to cause the “WHEA UNCORRECTABLE ERROR” issue. Customers who continue to experience the “WHEA UNCORRECTABLE ERROR” issue are encouraged to contact Intel Customer Support.
Intel recommends customers update the Intel NUC to BIOS version 0042. Systems with a product number of SAxxxxxx-503 or later, or systems with a blue dot label added to the Intel NUC and the outer box, have already been updated to BIOS version 0042.
Well well, would you look at that: An official fanless NUC from Intel featuring an i5-5350U:
http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/core-i5-5350u-eval-kit-product-brief.pdf
Why not a 6th gen? Probably because all those WHEA errors…
I’d imagine that the product lifecycle would be a bit longer from design to product than the 2 months that everyone has been talking about the WHEA errors.
So, what is the latest? Did Intel accept defeat with Skylake & working on the next chipset?
There’s nothing wrong with Skylake in general, and from the NUC standpoint I’ve had zero failures. Ironically I had more issues with the last round’s Broadwell units.
Mmmm…..I like the “in general” part. :o
I am very interested in getting a NUC6i5SYH. I have been holding off on buying it because of all the problems reported in the Intel forums. Judging by the latest Intel communications, it seems that the WHEA errors and the equivalent issues on GNU/Linux distributions have been solved by Intel by means of a BIOS update. What do you think I should do: buy it now or keep waiting for more feedback?
As said before: I updated to BIOS 39 and then 42 immediately and I have zero problems. NUC6i3SYK with elementary OS running for over a month now. I couldn’t be happier with mine. After Intel changed their wording reg. the 42 update from “improves” to “eliminates” …-problems, I have very high confidence that I’ll stay happy.
Thanks for your words, Alex. I look forward to getting my NUC6i5SYH.
I am also planning to buy a NUC6i5SYH. Thank you, Alex, for the info; it is good news for me.
Regards from North Pomerania Poland.
I also just purchased the NUC6i5SYH and immediately (even before loading an OS) updated BIOS to 42. That was only a few days ago, but I have since been running CPU and memory (with 32 GB) stress tests 24 x 7 on Windows 10 and have seen no issues whatsoever.
I got this error this morning; unfortunately I hadn’t read this post. Now I’ve updated the BIOS but that didn’t help. I also tried resetting Windows but this doesn’t seem to work. Now I have to get a replacement unit and then make sure I update the BIOS before installing Windows. Hopefully this works!
I’m thinking of purchasing this G.Skill 2x16GB 2800MHz RAM set from Amazon: http://amzn.to/239b3on .
These fast sticks should also help the internal Intel GPU to work faster as it relies on the RAM for its duties.
It’s currently on sale for just $202.54, and I’d like to ask if anyone here knows if it’s compatible with the NUC6i5KYH (BIOS v42)? I don’t think I’ve seen this set mentioned anywhere on the NUC compatibility list, but it might still work just fine.
Thanks in advance.
According to the Intel spec sheet on this model, it supports 2133 MHz memory, so you take a chance on that not working at all.
Your results may vary. Not sure if Alex is still problem free, but I just upgraded from BIOS 33 to BIOS 42 and am still getting WHEA errors. In Oracle VirtualBox, even just rebooting a VM is easily enough to make it fall over.
NUC6i5SYH, G.SKILL Ripjaws Series 32GB (2 x 16G) 260-Pin DDR4 SO-DIMM DDR4 2666 (PC4 21300) Laptop Memory Model F4-2666C18D-32GRS, Samsung 950 PRO -Series 512GB PCIe NVMe – M.2 Internal SSD 2-Inch MZ-V5P512BW.
Even just an xcopy can occasionally set it off although that isn’t as regular.
Wish you all luck. Definitely not 100% fixed…
Terry
Terry, does it recognise the 2666 MHz speed of the memory? Or does it run at 2133 MHz? (Just curious)
No. Took a while to find it in the BIOS but it is registered as 2133.
Yes, still problem free and happy here.
There is another statement from an intel employee on the nuc forum, basically stating that BIOS 42 will prevent the WHEA when you have a running unit but if you already got the WHEA, BIOS 42 won’t fix it. Your unit is then broken and you should RMA it.
PS: I updated to 39 immediately after unboxing and to 42 when it got released.
This was my experience as well. I had an i5 that got the WHEA error and updating to 42 did not solve it. I got a new unit and updated to 42 before I installed anything else on the system. It’s only been a week so it’s a bit early to celebrate but so far no errors.
I’ve got a new one set up but they won’t give a timeframe. Guess some new ones are going to stores and only some are available for broken. Not a clue when I’ll see one.
Curious on other’s response time. I’m guessing they’re just inundated by requests…
https://communities.intel.com/thread/101514?start=16
“The “WHEA UNCORRECTABLE ERROR” has been resolved with the BIOS version 0042, but unfortunately if your system already got this error the bios version 0042 will not remove the error from your system, if you already have this error I encourage to contact Intel Customer Support to get a replacement for the unit. If the unit is working fine and there is not whea error present, please update the BIOS to the version 0042, this bios version will prevent this whea error.”
Hmmm…don’t know where else to post this. This deserves its own post:
The first review of the Intel NUC6i7KYK is OUT!!
http://www.1hd.biz/2016/05/intel-nuc6i7kyk-skull-canyon-first.html
A direct link to the Skull Canyon review:
http://unlocked.newegg.com/intel-skull-canyon-nuc-hands-on-review/
For those who are interested, Simply NUC have just tested the “Skull Canyon” NUC6i7KYK with the game “World of Tanks”.
https://www.youtube.com/watch?v=OxxueDmQSpo
(you’re welcome)
Be warned that the Newegg “review” is disappointingly not a review, but rather a preview. Dated 20apr. The guy has been sent a preproduction sample, has taken it apart and installed RAM and SSD, and has a lot to say which repeats the many previews we have read for months. But he’s not powered it up and reviewed anything about its operation.
I agree, it’s not a proper review, but it doesn’t matter much. This guy gave us the first photos on the internals of the new NUC, which you currently can’t get anywhere else.
Some good news for NUC6i5 owners. There is a new BIOS version in the works (1142 beta) which unlocks the full potential of M.2 drives, especially the NVMe type.
BIOS 0042 supports OPI GT2, while BIOS 1142 adds support for OPI GT4, which unlocks the full performance potential. Skylake-U CPUs have the PCH inside the chip, so it’s called OPI (On Package interconnect), as opposed to DMI 3.0 in Skylake-H/S/K, where the PCH is separate from the chip. All these technical terms aside, this means that sequential reads are faster, and so are 4K writes.
Full article on AnandTech: http://goo.gl/o3ko4H
(you’re welcome)
My NUC unit is broken. Unbelievable that Intel screwed up so badly. I have purchased NUCs for many years but after this last experience I no longer dare to. I cannot afford to have the computer suddenly break physically.
Did you experience the WHEA error as well? When you get your replacement unit (these things have 3 year warranty), be sure to upgrade the BIOS first. Preferably even before installing the operating system.
Hi, I believe it is faulty hardware. You might also check your hardware; it’s a step-by-step process that you can follow here:
Solution 3. Test hardware
https://www.errorsolutions.tech/error/whea-uncorrectable-error/