More memory, same problems
Over the Memorial Day weekend, I find an eBay store selling compatible ECC DRAM for a surprisingly low price. I grab two 8 GB sticks (the Dell requires memory to be installed in pairs) for the price of a couple of pizzas.
I also track down a PDF of the Dell machine's manual and copy it to a directory on my home file server. I've been doing this with manuals for a few years, especially those for odd little devices whose documentation consists of one big foldout page that I usually wind up losing. I'm running nginx to serve these up, so I can view them from a browser any time I'm on my home network.
Once the memory arrives, I power down the Dell and pop open the case.
The connectors on the slots for each memory lane are color-coded:
The grey thing on the lower left is an airflow guide forcing cool air over that ginormous heatsink. It's got a clever quick release mechanism (not shown), and once I pop it off I have easy access to the memory slots:
Firmly press in the new memory sticks, close everything up, and behold!
It takes several minutes for Proxmox to spin up all four VMs. I wait a bit longer for the Kubernetes services to chat with each other, and log in to node-0. I get a login prompt in a couple of seconds and, overall, the system is pretty responsive, much better than before.
I still have the same error, though:
networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
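One common cause of this message (an assumption here, not something confirmed for this cluster) is that the container runtime finds no CNI network configuration to load. A minimal sketch of that check follows; it assumes the conventional config directory /etc/cni/net.d and the usual config file extensions, and the function names are made up for illustration.

```python
from pathlib import Path

# Assumption: the container runtime looks for CNI network configs here.
# This is the usual default, but a cluster can be configured differently.
CNI_CONF_DIR = Path("/etc/cni/net.d")

# File extensions conventionally accepted for CNI network configs.
CNI_SUFFIXES = {".conf", ".conflist", ".json"}


def cni_config_names(filenames):
    """Return the names that look like CNI network configs, sorted."""
    return sorted(n for n in filenames if Path(n).suffix in CNI_SUFFIXES)


def find_cni_configs(conf_dir=CNI_CONF_DIR):
    """List candidate CNI config files in conf_dir (empty if none are readable)."""
    if not conf_dir.is_dir():
        return []
    try:
        names = [p.name for p in conf_dir.iterdir() if p.is_file()]
    except OSError:
        # The directory is often readable by root only.
        return []
    return cni_config_names(names)


if __name__ == "__main__":
    configs = find_cni_configs()
    if configs:
        print("CNI configs found:", ", ".join(configs))
    else:
        print(f"No readable CNI configs in {CNI_CONF_DIR}; "
              "a network add-on may not be installed yet.")
```

An empty result only suggests where to look next. A config file that exists but is malformed, or that names a plugin binary that is not installed, can produce a similar error.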
I decide to do some housekeeping before digging into this further.
First, I apply the latest firmware update, from 2018. According to the release notes, this primarily addresses some of the speculative execution bugs that have been discovered in Intel processors over the years. While this won't fix my Kubernetes problem, it might give me a slight performance boost, since the Linux kernel sometimes enables software mitigations for unpatched firmware bugs.
And it's just good computing hygiene.
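The kernel reports its view of these speculative execution issues under /sys/devices/system/cpu/vulnerabilities, one file per issue, each holding a short status string such as "Not affected", "Vulnerable", or "Mitigation: ...". A rough sketch for comparing that state before and after a firmware update might look like this; the helper names are illustrative, and the prefix matching is a simplification of the status strings.

```python
from pathlib import Path

# Standard sysfs location where the Linux kernel reports CPU vulnerability status.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_vulnerabilities(vuln_dir=VULN_DIR):
    """Return {issue name: status string} for every readable file in vuln_dir."""
    statuses = {}
    if not vuln_dir.is_dir():
        return statuses
    for entry in sorted(vuln_dir.iterdir()):
        try:
            statuses[entry.name] = entry.read_text().strip()
        except OSError:
            continue  # skip anything that cannot be read
    return statuses


def mitigated(statuses):
    """Issues for which the kernel reports an active mitigation."""
    return sorted(n for n, s in statuses.items() if s.startswith("Mitigation"))


def still_vulnerable(statuses):
    """Issues the kernel reports as vulnerable."""
    return sorted(n for n, s in statuses.items() if s.startswith("Vulnerable"))


if __name__ == "__main__":
    current = read_vulnerabilities()
    for name, status in current.items():
        print(f"{name}: {status}")
    print("mitigated:", mitigated(current))
    print("still vulnerable:", still_vulnerable(current))
```

Running it once before the update and once after, then comparing the two outputs, shows whether the new firmware changed which mitigations the kernel applies.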
Dell provides a convenient Linux executable to apply this update from the command line:
root@vega:~# ./T110_BIOS_C4W9T_LN_1.12.0.BIN
Collecting inventory...
.
Running validation...
Server BIOS 11G
The version of this Update Package is newer than the currently installed version.
Software application name: BIOS
Package version: 1.12.0
Installed version: 1.3.4
Continue? Y/N:y
Executing update...
WARNING: DO NOT STOP THIS PROCESS OR INSTALL OTHER PRODUCTS WHILE UPDATE IS IN PROGRESS.
THESE ACTIONS MAY CAUSE YOUR SYSTEM TO BECOME UNSTABLE!
.......................................................................................
I then do some tinkering with the CPU settings of the VMs. While I was researching the virtualization capabilities of the Dell's Xeon CPU, I discovered that Xeon "Lynnfield" processors are part of the "Nehalem" processor family, which is one of the CPU options in Proxmox's VM configuration.
I stop node-0, change its hardware processor setting from "Conroe" to "Nehalem", and restart it.
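The same change can also be made from the Proxmox host's shell with the qm tool rather than through the web interface. The sketch below only builds and prints the commands by default; the VM ID 100 is a placeholder (node-0's real ID is not given here), and the helper names are hypothetical.

```python
import subprocess


def cpu_change_commands(vmid, cpu_type):
    """Build the qm invocations that switch a VM to a different CPU model."""
    vm = str(vmid)
    return [
        ["qm", "shutdown", vm],               # ask the guest to shut down cleanly
        ["qm", "set", vm, "--cpu", cpu_type],  # change the emulated CPU model
        ["qm", "start", vm],                  # boot the VM with the new setting
    ]


def apply_cpu_change(vmid, cpu_type, dry_run=True):
    """Print each command, and run it on the Proxmox host unless dry_run is set."""
    for cmd in cpu_change_commands(vmid, cpu_type):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # 100 is a placeholder VM ID, not the real ID of node-0.
    apply_cpu_change(100, "Nehalem", dry_run=True)
```

Keeping the dry run as the default makes it easy to review the commands before anything touches a running VM.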
It works! lscpu identifies the processor as Intel Core i7 rather than Celeron, and the CPU flags show SSE 4 as being supported.
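The same information that lscpu reports is available in /proc/cpuinfo, so the check can be scripted and repeated on each node. This is a minimal sketch that assumes the x86 field names "model name" and "flags" and the Linux flag names sse4_1 and sse4_2; the function names are illustrative.

```python
from pathlib import Path


def cpuinfo_field(cpuinfo_text, field):
    """Return the first value of a 'field : value' line in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == field:
            return value.strip()
    return ""


def cpu_flags(cpuinfo_text):
    """Return the CPU feature flags as a set of strings."""
    return set(cpuinfo_field(cpuinfo_text, "flags").split())


def has_sse4(flags):
    """True when both SSE4.1 and SSE4.2 are advertised."""
    return {"sse4_1", "sse4_2"} <= flags


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        text = path.read_text()
        print("model:", cpuinfo_field(text, "model name"))
        print("SSE4 supported:", has_sse4(cpu_flags(text)))
    else:
        print("/proc/cpuinfo not found; this check only applies to Linux")
```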
Next, I'm going to explore Proxmox further. I want to create a VM template so I can quickly spin up clean VMs, and see if I have better success with minikube or k3s than I have with the full Kubernetes distro. Hopefully, having a working Kubernetes install to compare against my current one will give me enough clues to identify the problem.
At worst, I'll have a working Kubernetes, just not the one I expected.