I have a fancy (to me) and new Nvidia 5070 Ti that I put into a very old Dell XPS 8900 with a Core i7-6700. I maxed out my computer RAM to 64GB of DDR4-something and replaced the slow old SATA drives with M.2 2280 SSDs.
I realize my mistake with the 5070 Ti: only 16GB of VRAM. The 5090 with 32GB would have been the better option, on paper at least.
I figured the GPU is critical for AI & vLLM, and my old CPU probably didn't matter too much at this very early point of my experimentation with AI, LLMs and such. But then I compiled PyTorch, lol (trying to compile with TORCH_USE_CUDA_DSA). 15+ hours! Smaller compiles of other projects can run for hours. It feels like the 1980s all over again.
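After all that compiling, this is the kind of sanity check I run to confirm the card is actually being picked up. It's a minimal sketch using only standard torch.cuda calls, nothing specific to my source build:

```python
# Quick sanity check that PyTorch actually sees the 5070 Ti.
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print("Device:", props.name)
    print("VRAM (GB):", round(props.total_memory / 1024**3, 1))
    print("Compute capability:", torch.cuda.get_device_capability(0))
```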
Looking at these new mini-boxes coming out ~soon, we have at a basic level the Nvidia Spark + reseller variants, the AMD Strix Halo 395+, and the Apple M4 Max.
I am not sure if vLLM would even run on the Apple M4 Max. It looks like the best overall box to me, with the best memory bandwidth by far, but I don't get the sense that vLLM or anything non-Apple can really make use of it. Then I recalled that Linux has run on Apple hardware for a long time, so maybe the vLLM ARM releases might run on a Mac? I have yet to see any mention of anybody's MacBook, so I'm not sure Apples are usable at all with non-Apple AI. Anybody know if that works?
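For what it's worth, plain PyTorch does have a Metal (MPS) backend on Apple Silicon, even if the vLLM story on Macs is a separate question. I don't own a Mac, so treat this as a sketch of the standard API rather than something I've actually run:

```python
# Minimal check of PyTorch's Apple-Silicon (Metal / MPS) backend.
import torch

if torch.backends.mps.is_available():
    x = torch.randn(1024, 1024, device="mps")
    y = x @ x  # runs on the M-series GPU via Metal
    print("MPS works, result on:", y.device)
else:
    print("MPS backend not available on this machine")
```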
Strix Halo 395+ and Nvidia Spark boxes seem to be relatively similar from what I can tell, other than the price. (Can you even buy an Nvidia DIGITS box? Its announced price was a little better than the Strix Halo 395+, but now that it has been renamed Spark it's about $1,000 higher.)
Is it even worth getting one of these? Or does it make more sense to use online services like Modal? I think $3,000 to $4,000 would go a long, long way on Modal or something. Plus I see a lot of folks talking about running on such-and-such systems that are beyond the financial reach of most people, even those with high-end IT-related incomes, unless maybe they are IT DINKs who live in small houses and don't blow their cash on fancy cars, etc., lol. So I figure people mean "online" when they talk about the very-very-expensive GPU they're running on. Is online the optimal way to go?
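Here's my back-of-envelope math on how far a mini-box budget would go rented online. The hourly rates below are placeholder assumptions I made up for illustration, not quotes from any provider, so check current pricing:

```python
# Back-of-envelope: how many GPU-hours does a mini-box budget buy online?
budget = 3500  # roughly the mini-box price range I mentioned
assumed_rates = {"mid-range GPU (~$1/hr, assumed)": 1.00,
                 "big GPU (~$4/hr, assumed)": 4.00}

for label, rate in assumed_rates.items():
    hours = budget / rate
    print(f"{label}: ~{hours:,.0f} hours (~{hours / 8:,.0f} 8-hour days)")
```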
OTOH, using an online service brings its own bag of tricks. I tried to farm out my 15+ hour PyTorch build to Modal (as a Modal first-timer). After working through innumerable mistakes I'd made, and finally getting probably ~90% of the way to Modal compiling for me, I realized that even if Modal would compile it, would the result even match my install at home (without me learning and doing a ton more Modal Image setup)? And would the Modal-compiled output even make it back to my home machine without a lot more figuring-stuff-out? Each iteration of that figuring-out either means waiting on a full build on Modal, which could be 10 to 20 minutes (from what I gather; please nobody be mad at me if 1 or 2 minutes is the real number), or waiting on me to figure out how to persist the built output on Modal and use it as my starting point for pulling the built results back to my system (and maybe the "my system" part was just a bad idea to begin with).
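For anyone else fumbling down the same path, this is roughly the shape of what I was trying to set up: a Modal function that writes its output into a persisted Volume you can pull down afterwards. It's a minimal sketch, with placeholder names (the app/volume names, the GPU type) and the actual PyTorch build commands elided:

```python
# Sketch: run a long build on Modal and persist the output in a Volume.
import modal

app = modal.App("pytorch-build")  # placeholder name
vol = modal.Volume.from_name("pytorch-build-output", create_if_missing=True)

image = modal.Image.debian_slim().apt_install("git", "cmake", "ninja-build")

@app.function(image=image, gpu="A10G", volumes={"/output": vol}, timeout=6 * 60 * 60)
def build():
    import subprocess
    # ...clone and build PyTorch here, then copy the wheel into /output...
    subprocess.run(["bash", "-lc", "echo 'build steps go here' > /output/placeholder.txt"],
                   check=True)
    vol.commit()  # persist what was written to the Volume
```

Afterwards the files should be retrievable locally with something like `modal volume get pytorch-build-output <remote-path> <local-dir>`. Whether the resulting wheel actually matches your local CUDA/Python environment is a separate problem, which was my real worry.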
During office hours I asked if vLLM would work on a Spark box by the time it works on GeForce 5000s. The question was taken more as a timeline thing, but what I meant was more "if this GPU I own works ~seamlessly with vLLM, can I infer from 5070 Ti = good that Spark would also = good?" I.e., the architectures should theoretically be the ~same (at least the CUDA-related parts) between the GeForce and the Spark, both of them being Blackwell. Or might I buy a Spark on release day assuming it'll work because the 50x0 GPU works, but then struggle with the Spark box for months trying to con it into working before support for it is officially there?
Another way to put it: "On some Date X, whatever real date X may be, given PyTorch & vLLM release versions supporting my 5070 Ti and running normally without my having to get under the bonnet too much (beyond any peculiarities in my own local box/environment), architecturally speaking should I expect the Spark to also be supported by release versions on that same Date X?" Or maybe the "integrated memory" might require more work from the very smart folks at PyTorch and vLLM, leaving this dummy (me) with a fancy new thing that doesn't do anything (yet) for some period of time, even though the 5070 Ti is working pretty seamlessly and well?
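The closest I've come to answering this myself is checking whether a given PyTorch build ships kernels for the compute capability of whatever GPU is in the box. The calls below are standard PyTorch; the example values in the comments are my assumptions about Blackwell parts, and PTX forward-compatibility can make a "missing" arch still work, so this is only a rough check:

```python
# Does this PyTorch build ship kernels for the GPU in this box?
import torch

cap = torch.cuda.get_device_capability(0)   # e.g. (12, 0) on my 5070 Ti (assumed)
built_for = torch.cuda.get_arch_list()      # e.g. ['sm_80', 'sm_90', 'sm_120', ...]
print("Device compute capability:", cap)
print("This build has kernels for:", built_for)
print("Exact match:", f"sm_{cap[0]}{cap[1]}" in built_for)
```

My open question is whether the Spark's chip reports the same (or a compatible) capability as the GeForce Blackwell cards, or needs its own target that release wheels might not include on day one.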
Wow that went all over the place. Sorry.
Quick recap: I'm seeking advice for AI/ML/etc. n00bs who have crappy old computers. Should we buy one of the mini-AI-boxes coming out this year, or use online services? And if we buy a new mini-AI-box, should we expect to wait some period of time for PyTorch & vLLM to support it (as happened with the Blackwell GPUs and, I bet, probably with all prior GPU architecture changes too)?