Serving the Washington, DC Region

The Power of Generative AI.
Without the Privacy Risks.

Run an AI model from OpenAI - a top AI provider - locally on your own hardware using free software, with no usage fees. From small businesses to private research, Sam Spencer AI helps you build the bridge between absolute data privacy and cutting-edge intelligence.

While a downloaded model is not the same as the huge online versions, the model we recommend punches well above its size and offers impressive reasoning and math skills. It will not run on a standard PC or Mac, but affordable hardware options are available. Check below for more details, or book a free consultation.

DIYers will find detailed instructions below to get going today. If you ever need help, request a free consultation and we'll see what we can do. After the one-time free consultation, further consults are billed at $10 per 6 minutes.

Book Free Phone Consultation

Absolute Privacy

Your data never leaves your premises. No cloud leaks, no third-party training on your proprietary information.

Zero Subscriptions

Own your intelligence. Stop paying monthly fees to big tech and run unlimited inference on your own terms.

Offline Capability

Work anywhere, anytime. Your AI workstation runs perfectly without an internet connection - though it can connect to the web when you want it to, for tasks like researching competitors' websites.

Recommended AI Workstations

We researched some great hardware to get you started. A PC with at least 20GB of VRAM is recommended; below that, things slow to a crawl by 12GB, and under 12GB the model will not really be useful. Here are some great options for anyone on a budget.

Minisforum AI Berzy

~$800

Entry Level for LLM

The Ryzen AI 9 HX 370 provides the most affordable entry point for private, local LLM operation.

View Details (not a sponsored link)

MSI Vector 16 HX AI

~$2,000+

High End Laptop

A mobile powerhouse. The RTX 5080 (16GB) is the gold standard for portable image and video generation.

View Details (not a sponsored link)

The Virtual Alternative: Shadow PC

~$50/MO

Instantly get 16GB-20GB VRAM on any device using their "Power" upgrade. Perfect for running gpt-oss-20b without buying new hardware.

Low Latency Pro GPU
Explore Shadow (not a sponsored link)

Hardware Deep Dive

You can add a graphics card to a PC you already have: smooth local AI needs at least 20GB of VRAM, with 16GB as the bare minimum. Sam Spencer AI can help guide your shopping, and any technology store like Micro Center or Best Buy can help you choose and install the right equipment. (Pro tip: take a photo of the label on the back of your PC to show the technician in the store.)

NVIDIA (CUDA)

RTX 4060 Ti (16GB Version)

The essential baseline. Warning: Avoid the 8GB model. 16GB VRAM is necessary for running 12GB+ models comfortably.

RTX 3090 / 4090 (24GB)

The Gold Standard. Massive 24GB VRAM allows for high-quality quantizations of Llama 3 or Stable Diffusion XL without breaking a sweat.

RTX 5080 (16GB) / 5090 (32GB)

The Blackwell Era. Next-gen architecture optimized for high-speed generative AI and massive performance gains.

AMD (ROCm)

Radeon RX 7900 GRE (16GB)

The Value Play. Incredible raw performance for users comfortable with the ROCm environment.

Radeon RX 7900 XTX (24GB)

High-Memory Alternative. A powerful 24GB competitor that often beats NVIDIA on pure price-to-VRAM ratio.

Radeon AI Pro W7900 (48GB)

Enterprise Research. 48GB of memory for professional local fine-tuning and very large context windows.

Hardware Sourcing & Deal-Finder

$10 per 6 mins (scheduled as needed)

Navigating used PC deals on Craigslist or Jawa? I can help you source a system that matches your needs and expectations without breaking the bank.

DIY: Run OpenAI's gpt-oss Locally

Skip the subscriptions. Use the guide below to install OpenAI's open-weight gpt-oss model on your own hardware.

1

Hardware Check

  • 16-20GB VRAM: the minimum needed to run the 20B-parameter model.
  • 20GB Disk Space: Ensure you have at least 20GB free for the model and software.
  • Speed Warning: Running the model from an external HD or SD card will cause extremely slow load times. For best performance, use an Internal SSD.
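Before downloading anything, it's worth confirming how much VRAM you actually have. The sketch below is one way to do that on NVIDIA systems; it assumes the `nvidia-smi` command-line tool (installed with NVIDIA drivers) is on your PATH, and AMD users would query `rocm-smi` instead.

```python
# VRAM check sketch for NVIDIA GPUs. Assumes the nvidia-smi CLI
# (shipped with NVIDIA drivers) is available on PATH.
import subprocess


def parse_vram_gb(smi_output: str) -> float:
    """Sum per-GPU memory.total values (reported in MiB) into GB."""
    return sum(int(line) for line in smi_output.split()) / 1024


def total_vram_gb() -> float:
    # Ask nvidia-smi for total memory per GPU, one number per line.
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_vram_gb(out)


# Usage (on a machine with an NVIDIA GPU):
#   gb = total_vram_gb()
#   if gb < 16:
#       print(f"Only {gb:.1f} GB VRAM - below the minimum for gpt-oss-20b.")
```

If the reported total is under 16GB, revisit the hardware options above before proceeding.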
2

The Setup

1. Download LM Studio & AnythingLLM.

2. Open LM Studio and search for gpt-oss-20b.

3. Download the GGUF version. LM Studio will automatically store this in your local models folder.

3

Link & Launch

  1. Load: In LM Studio, click the AI Chat icon and select gpt-oss-20b from the top dropdown.
  2. Server: Go to the Local Server tab and click Start Server.
  3. Link: In AnythingLLM, go to Settings > LLM Preference. Set Provider to LM Studio and it will link automatically. Upload files in the prompt, or look into downloadable RAG plug-ins to safely point the model at your local private files.
  4. You are ready: Use AnythingLLM as your chat bot. Go into settings, enable the web-access slider, and select a browser (we recommend DuckDuckGo for its privacy focus). And if you're a DIYer, check out the downloadable RAG tools that let you interact with your local files without pasting them into your prompt each time.
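For the curious: once LM Studio's local server is running (step 2 above), it exposes an OpenAI-compatible HTTP API, by default at http://localhost:1234/v1, so you can also talk to the model from your own scripts. The sketch below uses only Python's standard library; the port and model name match LM Studio's defaults, but verify both in your own Local Server tab.

```python
# Minimal sketch of calling LM Studio's OpenAI-compatible local server.
# Assumes the server is running at its default address (localhost:1234)
# with gpt-oss-20b loaded; adjust both to match your Local Server tab.
import json
import urllib.request

SERVER = "http://localhost:1234/v1/chat/completions"


def build_request(prompt: str, model: str = "gpt-oss-20b") -> dict:
    """Build a standard chat-completions payload for the local server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }


def ask(prompt: str) -> str:
    """Send the prompt to the local server and return the reply text."""
    body = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        SERVER, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]


# Usage (requires the LM Studio server to be running):
#   print(ask("Summarize this quarter's sales notes in three bullets."))
```

Because the request never leaves your machine, scripts like this keep the same privacy guarantees as the chat interface.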
Sam Spencer

"I help professionals and business owners reclaim their privacy while maximizing their potential with local intelligence."

Serving the Washington, DC Metropolitan Area

Free Phone Consultation | $200 White-Glove Installation

sam@samspencerai.com