1 Rookie

 • 

3 Posts

April 23rd, 2024 19:49

PowerEdge 2970 DIMM MBE Issue

I'm posting here as a last resort, as I cannot seem to fix this on my own. I have a new-to-me PowerEdge 2970 (yes, I know the server is well past support and should absolutely be replaced by today's standards; however, it is only used for home/remote storage and it fits my needs perfectly for what it does). The server saw very little use under its previous owner.

When it came into my possession, I noted it only had one AMD OS2374 and 8GB of RAM. To optimize it, I bought another OS2374 processor (along with another heatsink) and a total of 32GB of RAM (8 matching 4GB DIMMs) for the server. On the first boot after the install, the notification screen immediately showed two MBE 2110 errors (one for banks 5&6, another for 7&8). My first thought was a bad DIMM, so I swapped DIMMs between the banks (between 1&2 and 5&6, then 3&4 and 7&8). I reset the errors and restarted: same errors. No matter which sticks are in the banks for Processor 1, that side never throws an MBE.

I then swapped the processors: I pulled both, switched their sockets, and re-pasted the heatsinks. Same errors on the same banks (5&6 and 7&8)...

The BIOS setup does detect the RAM, but it only shows 16GB as usable. Windows reports: 32GB (16GB Available).
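For what it's worth, the 16GB figure lines up with the four flagged banks being left out entirely. A quick sketch of that arithmetic, assuming (and this is only my assumption) that the firmware disables each flagged bank wholesale rather than part of it:

```python
# Sketch only: assumes each bank holds one 4GB DIMM and that a bank
# flagged with an MBE is excluded from usable memory as a whole.
DIMM_SIZE_GB = 4
ALL_BANKS = range(1, 9)          # banks 1 through 8
FLAGGED_BANKS = {5, 6, 7, 8}     # banks reporting MBEs at POST


def usable_memory_gb(dimm_size_gb, banks, disabled):
    """Sum the capacity of every bank that is not disabled."""
    return sum(dimm_size_gb for bank in banks if bank not in disabled)


installed = DIMM_SIZE_GB * len(ALL_BANKS)
usable = usable_memory_gb(DIMM_SIZE_GB, ALL_BANKS, FLAGGED_BANKS)
print(f"installed: {installed} GB, usable: {usable} GB")
```

So the numbers are consistent with all of Processor 2's banks being dropped, not with one or two bad sticks.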

I am currently running Windows Server 2012 R2 with OpenManage Server Administrator, which shows MBEs on banks 5, 6, 7, and 8. The second processor is detected (with no errors) both during POST and in Windows. POST reports 16GB of memory, shows an error for banks 5&6 and 7&8, and makes me press F1 to continue (or F2 for setup).

My question is this: is there something I failed to do when I added the second processor, given that those banks were not used before? I went through the BIOS setup and do not see any option to enable the other banks, but I could be missing something. I understand it could be a motherboard fault, and honestly, if that's what it is, I am just going to keep running the server as is until it fails or I decide to upgrade.

Again, I know this server is well past its life expectancy; however, it fits my needs just fine for now. It would be nice to clear the errors once and for all. There are no other errors anywhere else on the server, and I have replaced the CMOS battery as well as the PERC6 battery.

Lastly, if you made it this far, thank you for your help.

Moderator

 • 

3.2K Posts

April 24th, 2024 11:59

Hi,

The error message "Warning! One or more faulty DIMMs found on CPUn" indicates that there is a problem with the memory modules used by CPU n (where n is the CPU number). On its own, this would suggest that the memory modules in banks 5, 6, 7, and 8 are faulty or improperly seated.

However, the fact that the issue persists even after swapping the processors and the DIMMs between the banks points to a potential motherboard or system-level problem, rather than an issue with the individual components. The BIOS only recognizing 16GB of usable memory out of the 32GB installed is also a symptom of the MBE issue.

Unfortunately, there does not appear to be a straightforward solution to this problem based on the information provided. The manual suggests that the next step would be to update the BIOS firmware, as indicated by the warning "Update the BIOS firmware. See 'Getting Help' on page 147." If updating the BIOS does not resolve the issue, it is likely that the motherboard or the memory controller on the CPUs is faulty, which would require more extensive troubleshooting or replacement of the affected components.

Given the age of the PowerEdge 2970 server, it may be more practical to consider upgrading to a newer system that can better accommodate your needs. However, if the server is still meeting your requirements, you can continue using it as-is until it fails or you decide to replace it.
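The elimination reasoning above can be written out as a small sketch. The `Trial` class and the `likely_culprit` function are names made up for illustration, and the rule they encode (if the errors stay on the same slots no matter which part was moved, suspect the slots or board) is a rule of thumb, not a Dell diagnostic procedure:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One swap experiment and the banks that reported MBEs afterwards."""
    description: str
    error_banks: frozenset


def likely_culprit(baseline_banks, trials):
    """Rule of thumb: errors that never move with the parts point at the board."""
    if trials and all(t.error_banks == baseline_banks for t in trials):
        return "slots/motherboard"
    return "a moved component (DIMM or CPU)"


# The swaps described in this thread; every one left the errors on banks 5-8.
baseline = frozenset({5, 6, 7, 8})
trials = [
    Trial("swapped DIMMs between banks 1&2 and 5&6", baseline),
    Trial("swapped DIMMs between banks 3&4 and 7&8", baseline),
    Trial("swapped the two processors", baseline),
]
print(likely_culprit(baseline, trials))
```

Had any swap moved the errors to banks 1 through 4, the same check would instead point at the part that was moved.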

Moderator

 • 

3.6K Posts

April 24th, 2024 03:46

Hello, thanks for choosing Dell and welcome to our community.

Could you please refer to this?

https://dell.to/3QhMFTw

 

Page 6

 

Emerald Rapids require RHEL 9.3 or later

1 Rookie

 • 

3 Posts

April 24th, 2024 10:20

@DELL-Young E

Hey there, I don’t really see what this has to do with my issue. I am not running Linux at all on this server…

1 Rookie

 • 

3 Posts

April 24th, 2024 14:21

@Dell-Martin S

Thank you, sir, this is exactly where I am at with it. I definitely do not see a motherboard swap as worth it, since the server is working fine as is (other than the error at POST). The BIOS was already updated by the Dell Update utility when I got my RAID arrays set up and Windows loaded. It didn’t help anything.

I will probably just keep an eye out for a suitable replacement for when the time comes.

Thank you for your time, which confirmed my thoughts.

1 Rookie

 • 

4 Posts

April 25th, 2024 22:32

The "DIMM MBE issue" on a PowerEdge 2970 server typically refers to a Multi-Bit Error (MBE), a memory error that ECC generally cannot correct, detected on one or more DIMMs (Dual In-line Memory Modules). Here's how you can address this issue:

1. **Identify the Faulty DIMM**: Start by identifying which DIMM is causing the MBE issue. Most server hardware includes diagnostic tools or management software that can help you pinpoint the problematic DIMM.

2. **Replace the Faulty DIMM**: Once you've identified the faulty DIMM, replace it with a new one. Make sure to use compatible DIMMs that meet the server's specifications, including speed, type, and capacity.

3. **Update Firmware and Drivers**: Ensure that your server's firmware (BIOS) and drivers are up to date. Sometimes, firmware updates include fixes for memory-related issues or improve system stability.

4. **Run Memory Tests**: After replacing the faulty DIMM, run memory tests to verify that the issue has been resolved and that the server's memory subsystem is functioning correctly. Many server management tools include built-in memory diagnostics that can help with this.

5. **Monitor for Recurrence**: Keep an eye on your server's logs and monitoring tools for any signs of recurring memory errors. If you continue to experience MBE issues, there may be underlying hardware problems that need to be addressed.

6. **Consider System Health**: Evaluate the overall health of your server hardware. Persistent memory errors could be indicative of broader issues with the server's components, such as the memory controller, motherboard, or power supply.

7. **Contact Support**: If you're unable to resolve the issue on your own, consider reaching out to Dell EMC support or consulting with a qualified technician for further assistance. They can provide specialized guidance and assistance based on your specific server configuration and environment.

By following these steps, you can troubleshoot and address the DIMM MBE issue on your PowerEdge 2970 server, ensuring reliable performance and stability for your IT infrastructure.
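As a rough illustration of steps 1 and 5, here is a minimal sketch that tallies MBE events per bank across reboots. The event tuples are hypothetical, hand-written examples and not the output format of OpenManage or any other Dell tool; you would have to fill them in from whatever your management software reports:

```python
from collections import Counter

# Hypothetical, hand-entered events: (boot label, bank number, error kind).
events = [
    ("boot 1", 5, "MBE"), ("boot 1", 6, "MBE"),
    ("boot 1", 7, "MBE"), ("boot 1", 8, "MBE"),
    ("boot 2", 5, "MBE"), ("boot 2", 6, "MBE"),
    ("boot 2", 7, "MBE"), ("boot 2", 8, "MBE"),
    ("boot 2", 2, "SBE"),  # a single-bit error, ignored by the tally below
]


def recurring_mbe_banks(events, threshold=2):
    """Return banks that logged at least `threshold` MBE events, sorted."""
    counts = Counter(bank for _, bank, kind in events if kind == "MBE")
    return sorted(bank for bank, n in counts.items() if n >= threshold)


print(recurring_mbe_banks(events))
```

Banks that keep showing up after DIMMs have been swapped are the ones worth treating as a slot or board problem.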
