⚙️ System Overview
Comprehensive hardware analysis for local AI model performance on this system.
🧠 Processor
AMD Ryzen 7 7730U, 8 cores / 16 threads - well-suited for CPU-based LLM inference.
💾 Memory
62 GB of RAM - plenty for large language models up to 70B parameters in quantized formats.
🎮 Graphics
Integrated AMD Radeon graphics - suitable for text LLMs but limited for image generation compared to dedicated GPUs.
💿 Storage
Fast NVMe storage - sufficient space for multiple large language models and datasets.
🐧 Operating System
Arch Linux provides latest software packages and kernel updates ideal for development.
🛠️ Software Stack
See the Detailed Software Inventory later in this page for the full toolchain powering OpenMemory on this machine.
🤖 Large Language Model Performance
Expected performance for various LLM sizes running via Ollama on Samus:
| Model Size | Example Models | RAM Required | Expected Performance | Tokens/Second | Recommendation |
|---|---|---|---|---|---|
| 3B-7B | Llama 3.2 3B, Phi-3, Mistral 7B | 4-8 GB | Excellent | 30-50 t/s | ✅ Highly recommended - Fast and responsive |
| 8B-13B | Llama 3.1 8B, Gemma 2 9B, Llama 2 13B | 8-16 GB | Excellent | 15-30 t/s | ✅ Recommended - Great balance of quality and speed |
| 14B-34B | Qwen 2.5 32B, CodeLlama 34B, Mixtral 8x7B | 16-32 GB | Good | 8-15 t/s | ✅ Usable - High quality, moderate speed |
| 40B-70B | Llama 3.1 70B (Q4), Qwen 2.5 72B (Q4) | 35-50 GB | Moderate | 3-8 t/s | ⚠️ Possible but slow - Use quantized versions |
| 100B+ | Llama 3.1 405B, Falcon 180B | 80+ GB | Not Recommended | <2 t/s | ❌ Too large - Use cloud APIs instead |
⚡ CPU Inference Note
Since Samus uses integrated graphics, LLM inference will run on CPU. The 8-core Ryzen 7 7730U with 62GB RAM is well-suited for this, but inference will be slower than on systems with dedicated GPUs (typically 3-5x slower than NVIDIA RTX 4090).
Quantization is key: Use Q4_K_M or Q5_K_M quantized models for best balance of quality and performance.
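To check where a given model actually lands in the table above, Ollama's HTTP API reports eval_count and eval_duration for each request, which gives tokens/second directly. A minimal sketch in TypeScript (Node 18+), assuming Ollama is running on its default port and llama3.1:8b has already been pulled:

```typescript
// Rough throughput check against a local Ollama instance (default port 11434).
// Assumes the model below has already been pulled, e.g. `ollama pull llama3.1:8b`.
const OLLAMA_URL = "http://localhost:11434/api/generate";

async function measureTokensPerSecond(model: string, prompt: string): Promise<number> {
  const res = await fetch(OLLAMA_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  const data = await res.json();
  // eval_count = generated tokens, eval_duration = generation time in nanoseconds
  return data.eval_count / (data.eval_duration / 1e9);
}

measureTokensPerSecond("llama3.1:8b", "Explain temporal decay in two sentences.")
  .then((tps) => console.log(`~${tps.toFixed(1)} tokens/second`));
```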
🌟 Recommended Ollama Models for Samus
These models will run smoothly on your hardware (a pull script follows the categories below):
💬 For Conversation & General Tasks
e.g. Llama 3.2 3B, Mistral 7B, Llama 3.1 8B - fast, responsive, great for daily use.
💻 For Coding
e.g. Qwen 2.5 Coder 7B, DeepSeek Coder 6.7B - optimized for code generation and debugging.
🧠 For Advanced Reasoning
e.g. Gemma 2 9B, Qwen 2.5 32B - higher quality responses, still fast enough.
📚 For Maximum Quality (Slower)
e.g. Llama 3.1 70B (Q4), Qwen 2.5 72B (Q4) - top-tier models, expect 3-8 tokens/second.
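A convenience sketch for pulling a set of models like this in one go; the exact tags are assumptions, so check the Ollama model library for the variants you actually want:

```typescript
// Pull the suggested models via the Ollama CLI. Tags are assumptions —
// verify with `ollama list` or the Ollama model library.
import { execSync } from "node:child_process";

const models = [
  "llama3.2:3b",         // conversation / general tasks
  "mistral:7b",          // general purpose
  "qwen2.5-coder:7b",    // coding
  "deepseek-coder:6.7b", // coding (alternative)
  "llama3.1:8b",         // reasoning / OpenMemory synthesis
  "llama3.1:70b",        // maximum quality (default quantization, slow on CPU)
];

for (const model of models) {
  console.log(`Pulling ${model}...`);
  execSync(`ollama pull ${model}`, { stdio: "inherit" });
}
```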
🎨 Image Generation Performance
Analysis of AI image generation capabilities on integrated AMD Radeon graphics:
| Tool/Model | Resolution | Expected Time | Feasibility | Notes |
|---|---|---|---|---|
| Stable Diffusion 1.5 | 512x512 | 3-5 minutes | Possible | CPU inference - very slow but functional |
| Stable Diffusion XL | 1024x1024 | 10-15 minutes | Not Practical | Too slow for regular use |
| Flux | 1024x1024 | 15-20 minutes | Not Recommended | Requires dedicated GPU with 12GB+ VRAM |
| Cloud Services (Midjourney, DALL-E) | Any | 10-60 seconds | Recommended | Best option for image generation on this hardware |
🎨 Image Generation Recommendation
For this hardware configuration, we recommend using cloud-based image generation services:
- Midjourney ($10-60/month) - Highest quality
- DALL-E 3 via ChatGPT Plus ($20/month) - Good quality, easy access
- Leonardo.ai (Free tier available) - Good for testing
- Replicate API (Pay per use) - Flexible pricing
Local generation is technically possible but impractically slow on integrated graphics.
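As an illustration of the cloud route, a hosted image API returns results in seconds rather than minutes. A minimal sketch against the OpenAI Images endpoint (DALL-E 3), assuming an OPENAI_API_KEY environment variable; Replicate and Leonardo.ai expose comparable HTTP APIs:

```typescript
// Offload image generation to a hosted API instead of the iGPU.
// Assumes process.env.OPENAI_API_KEY is set; resolves to a URL for the generated image.
async function generateImage(prompt: string): Promise<string> {
  const res = await fetch("https://api.openai.com/v1/images/generations", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({ model: "dall-e-3", prompt, n: 1, size: "1024x1024" }),
  });
  const data = await res.json();
  return data.data[0].url; // hosted result, typically ready in 10-60 seconds
}

generateImage("A cosmic dark-themed dashboard illustration").then(console.log);
```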
🧠 OpenMemory Performance on Samus
How well OpenMemory will perform with your hardware:
✅ Excellent Match!
Samus is well-suited for running OpenMemory with local LLMs:
- 62GB RAM allows running multiple model instances or very large models
- 8-core CPU provides good parallelization for memory operations
- Fast NVMe storage ensures quick memory persistence and retrieval
- Ollama integration works perfectly with CPU-based inference
📊 Recommended OpenMemory Configuration
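The documented settings (see the .env example later in this page) can be combined with the local models from the metrics table below. A sketch of one sensible combination for Samus; the ollama block and any key names not shown in the .env example are illustrative, not confirmed option names:

```typescript
// Illustrative OpenMemory configuration for Samus (CPU inference via Ollama).
// DATABASE_URL, MCP_HOST, MEMORY_MAX_TOKENS and EMBEDDING_MODEL appear in the
// .env example later in this document; the ollama block is a hypothetical shape.
const config = {
  databaseUrl: process.env.DATABASE_URL ?? "sqlite://./data/memory.db",
  mcpHost: process.env.MCP_HOST ?? "localhost",
  memoryMaxTokens: Number(process.env.MEMORY_MAX_TOKENS ?? 4000),
  embeddingModel: process.env.EMBEDDING_MODEL ?? "nomic-embed-text", // local embeddings
  ollama: {
    baseUrl: "http://localhost:11434",
    chatModel: "llama3.1:8b", // storage, queries, synthesis (see metrics below)
    keepAlive: "10m",         // keep the model warm between memory operations
  },
};

export default config;
```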
⚡ Expected Performance Metrics
| Operation | Model Used | Expected Speed | Quality |
|---|---|---|---|
| Memory Storage (50KB context) | Llama 3.1 8B | 5-10 seconds | Excellent |
| Memory Retrieval (semantic search) | nomic-embed-text | <1 second | Excellent |
| Query Processing | Llama 3.1 8B | 2-5 seconds | Excellent |
| Synthesis (combining memories) | Llama 3.1 8B | 3-8 seconds | Very Good |
⚖️ Comparison: Samus vs. Other Configurations
| Configuration | CPU | RAM | GPU | LLM Speed (13B) | Image Gen | Cost |
|---|---|---|---|---|---|---|
| Samus (This System) | Ryzen 7 7730U 8c/16t | 62 GB | Radeon (iGPU) | 15-25 t/s | Not practical | $0 (owned) |
| Gaming Desktop (Mid-Range) | Ryzen 5 7600X 6c/12t | 32 GB | RTX 4060 Ti (16GB) | 40-60 t/s | Good (SD/SDXL) | ~$1,500 |
| Gaming Desktop (High-End) | Ryzen 9 7950X 16c/32t | 64 GB | RTX 4090 (24GB) | 80-120 t/s | Excellent (Flux) | ~$3,500 |
| AI Workstation | Threadripper 24c/48t | 128 GB | 2× RTX 4090 | 150-200 t/s | Excellent | ~$8,000 |
| Cloud (Vast.ai RTX 4090) | Variable | Variable | RTX 4090 (24GB) | 80-120 t/s | Excellent | ~$0.30-0.70/hr |
🚀 Upgrade Path (If Needed)
If you want to improve local AI performance, here are the most impactful upgrades:
Option 1: External GPU
Moderate impact. If Samus has Thunderbolt 3/4, you could add an external GPU (eGPU) enclosure:
- RTX 4060 Ti (16GB) - $500 + $300 enclosure = $800
- Would enable: Fast image generation, 2-3x faster LLM inference
- Limitation: Thunderbolt bandwidth bottleneck (~30% performance loss)
Option 2: Desktop Companion
High impact. Build a dedicated AI desktop for heavy workloads:
- Ryzen 9 7900X + RTX 4070 Ti Super (16GB) - ~$1,800
- Keep Samus for portability, use desktop for AI generation
- Access remotely via SSH/RDP when away
Option 3: Cloud Hybrid
Cost effective. Use cloud GPUs for intensive tasks:
- Samus: Fast LLMs (7B-13B), OpenMemory, development
- Cloud (Vast.ai/RunPod): Image generation, 70B+ models
- Cost: ~$20-50/month for occasional heavy use
Option 4: Do Nothing
Recommended. Samus is already excellent for most AI tasks:
- 62GB RAM handles up to 70B quantized models
- Great for OpenMemory, coding assistants, research
- Use cloud services ($10-20/month) for image generation
- Total cost: Much less than hardware upgrades
📦 Detailed Software Inventory
Comprehensive catalog of all tools, libraries, and utilities powering the OpenMemory project:
💻 Core Development Environment
AI Assistant
- Claude Code: Claude Sonnet 4.5 (claude-sonnet-4-5-20250929)
- Purpose: Primary AI development assistant with OpenMemory integration
- Implementation: Persistent memory across sessions, autonomous debugging, documentation generation
Runtime Environments
- Node.js: v25.1.0 - JavaScript runtime for backend services
- Python: 3.13.7 - Data processing and benchmarking scripts
- TypeScript: 5.6.3 - Type-safe development environment
Package Managers & Runners
- npm: 11.6.2 - Node package management
- tsx: 4.19.2 - TypeScript execution and hot reload
Version Control
- Git: 2.51.2 - Source code management
- Implementation: Project history tracking, collaboration, deployment
🧠 OpenMemory Backend Stack
Core Framework
- OpenMemory Backend: v1.0.0
- Architecture: Framework-agnostic memory system for LLMs
- Purpose: Persistent, multi-sector memory with temporal decay and intelligent retrieval
Database Systems
- PostgreSQL (pg): 8.16.3 - Primary database driver
- SQLite3: 5.1.6 - Lightweight local database option
- Implementation: Memory persistence, vector embeddings, temporal tracking
Protocol & Communication
- Model Context Protocol SDK: 1.20.2
- WebSocket (ws): 8.18.3 - Real-time bidirectional communication
- Purpose: LLM integration via standardized MCP protocol
Data Processing
- Zod: 3.23.8 - TypeScript schema validation
- dotenv: 16.4.5 - Environment configuration management
- Mammoth: 1.11.0 - Microsoft Word (.docx) document parsing
- pdf-parse: 2.4.3 - PDF document extraction
- Turndown: 7.2.1 - HTML to Markdown conversion
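To show how these pieces fit together, here is a sketch of a .docx ingestion path (Mammoth to HTML, Turndown to Markdown, Zod validation); the MemoryInput schema mirrors the sector names described later in this page, but its exact shape is an assumption:

```typescript
// Sketch: parse a Word document, convert it to Markdown, and validate the
// payload before storing it as a memory. The MemoryInput schema is illustrative.
import "dotenv/config";
import mammoth from "mammoth";
import TurndownService from "turndown";
import { z } from "zod";

const MemoryInput = z.object({
  sector: z.enum(["semantic", "procedural", "episodic", "reflective", "emotional"]),
  content: z.string().min(1),
  importance: z.number().min(0).max(1).default(0.5),
});

async function docxToMemory(path: string) {
  const { value: html } = await mammoth.convertToHtml({ path });
  const markdown = new TurndownService().turndown(html);
  return MemoryInput.parse({ sector: "semantic", content: markdown });
}

docxToMemory("./notes/meeting.docx").then(console.log);
```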
🤖 Local LLM Infrastructure
Ollama Platform
- Ollama: 0.12.9
- Purpose: Local LLM deployment and inference
- Implementation: Running quantized models (7B-70B parameters) locally
- Integration: Connected to OpenMemory for persistent context
Supported Models
- Llama 3.1: 8B and 70B (quantized) variants
- Qwen 2.5 Coder: 7B - Optimized for coding tasks
- DeepSeek Coder: 6.7B - Alternative coding model
- Mistral: 7B - General-purpose model
🎨 Frontend & Documentation
Rendering Libraries
- Marked.js: Latest - Markdown to HTML conversion
- Highlight.js: 11.9.0 - Syntax highlighting for code blocks
- Chart.js: Latest - Interactive benchmark visualizations
Web Technologies
- HTML5 Canvas: Interactive memory network visualization
- CSS3: Modern styling with gradients, animations, responsive design
- JavaScript ES6+: Client-side interactivity and physics simulation
Documentation Pages
- Implementation: 9 interconnected HTML pages with unified navigation
- Features: Interactive visualizations, benchmark reports, session logs
- Design: Cosmic dark theme with sector-specific color coding
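As a small example of the rendering path these pages use (the element ID and sample Markdown are illustrative):

```typescript
// Client-side rendering sketch: Markdown -> HTML with Marked, then syntax
// highlighting with Highlight.js. The target element ID is illustrative.
import { marked } from "marked";
import hljs from "highlight.js";

function renderDoc(markdown: string): void {
  const target = document.getElementById("doc-content");
  if (!target) return;
  target.innerHTML = marked.parse(markdown) as string;
  // Highlight every fenced code block that Marked produced
  target.querySelectorAll("pre code").forEach((block) => {
    hljs.highlightElement(block as HTMLElement);
  });
}

renderDoc("# Benchmarks\n\n```ts\nconst tokens = 4000;\n```");
```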
🔧 Development Tools & Utilities
Type Safety & Linting
- TypeScript Compiler: 5.6.3 - Static type checking
- @types/node: 20.10.5 - Node.js type definitions
- @types/pg: 8.15.5 - PostgreSQL type definitions
Build & Execution
- tsx: 4.19.2 - TypeScript execution without compilation
- tsc: TypeScript compiler for production builds
- Scripts: npm run dev (hot reload), npm run build, npm start
Testing & Benchmarking
- Custom Benchmark Suite: 8 comprehensive analyses
- Metrics: Memory efficiency, retrieval accuracy, temporal decay, context preservation
- Validation: 30-50% token reduction, 95%+ semantic accuracy
📁 Project Structure
Core Components
- /backend: OpenMemory server (Node.js + TypeScript)
- /sdk-js: JavaScript/TypeScript SDK (v0.2.0)
- /IDE: IDE integrations (v1.0.1)
- /docs: HTML documentation ecosystem (9 pages)
Memory Architecture
- Sectors: Semantic, Procedural, Episodic, Reflective, Emotional
- Features: Temporal decay, importance weighting, intelligent retrieval
- Storage: PostgreSQL or SQLite with vector embeddings
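The retrieval behaviour described here (temporal decay plus importance weighting over vector embeddings) can be pictured with a simple scoring function; this is an illustrative formula, not the backend's actual implementation:

```typescript
// Illustrative retrieval score: cosine similarity weighted by importance and
// an exponential temporal decay. Not the backend's actual formula.
interface MemoryNode {
  embedding: number[];
  importance: number; // 0..1
  createdAt: Date;
}

const HALF_LIFE_DAYS = 30; // assumed decay half-life

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b));
}

function retrievalScore(query: number[], node: MemoryNode, now = new Date()): number {
  const ageDays = (now.getTime() - node.createdAt.getTime()) / 86_400_000;
  const decay = Math.pow(0.5, ageDays / HALF_LIFE_DAYS); // halves every HALF_LIFE_DAYS
  return cosine(query, node.embedding) * node.importance * decay;
}
```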
⚙️ System Requirements & Dependencies
Minimum Requirements
- Node.js: v18.0.0 or higher (v25.1.0 recommended)
- npm: v9.0.0 or higher (v11.6.2 recommended)
- Python: 3.10+ for benchmarking scripts (3.13.7 recommended)
- RAM: Minimum 8GB (16GB+ recommended for larger models)
- Storage: 10GB minimum (50GB+ for multiple LLM models)
- OS: Linux (Arch, Ubuntu, Debian), macOS, or Windows (WSL2 recommended)
Database Requirements
- Option 1 - PostgreSQL: v12+ (recommended for production)
  - Install (Arch): sudo pacman -S postgresql
  - Setup: Create the database and configure the connection string in .env
  - Extensions: pgvector for vector similarity search (optional)
- Option 2 - SQLite: installed automatically via the sqlite3 npm package
  - No separate installation required
  - Ideal for development and single-user deployments
  - Built-in with the OpenMemory backend
Optional Dependencies (LLM Support)
- Ollama: v0.12.0+ for local LLM inference
  - Install: curl https://ollama.ai/install.sh | sh
  - GPU Support: CUDA 11.8+ (NVIDIA) or ROCm 5.7+ (AMD)
  - Models: Download via ollama pull llama3.1:8b
Build Dependencies
- C++ Compiler: required for native modules (sqlite3, pg-native)
  - Arch Linux: sudo pacman -S base-devel
  - Ubuntu/Debian: sudo apt install build-essential
  - macOS: Xcode Command Line Tools
- Python Build Tools: for Python dependencies
  - pip, setuptools, wheel (usually included with Python)
🌐 Virtual Environment & Installation
Node.js Environment (Backend)
cd OpenMemory/backend
npm install    # Install dependencies
nano .env      # Configure database connection
npm run build  # Compile TypeScript (outputs to dist/)
npm start
Python Virtual Environment (Benchmarking)
python -m venv venv
source venv/bin/activate   # Linux/macOS
venv\Scripts\activate      # Windows
Environment Variables (.env)
DATABASE_URL=sqlite://./data/memory.db
MCP_HOST=localhost
MEMORY_MAX_TOKENS=4000
EMBEDDING_MODEL=text-embedding-ada-002
Docker Support (Optional)
💡 Note: OpenMemory is designed to work seamlessly with Node's built-in module system. No additional virtual environment management is needed beyond npm's package isolation.
🤖 Claude Code Integration
MCP Configuration
Configure Claude Code to connect to OpenMemory via Model Context Protocol:
"mcpServers": {
"openmemory": {
"command": "node",
"args": ["/path/to/OpenMemory/backend/dist/server/index.js"],
"env": {
"DATABASE_URL": "sqlite://./data/memory.db"
}
}
}
}
Available MCP Tools
- store_memory: Save new memories to any sector
- recall_memory: Retrieve relevant memories by query
- list_memories: Browse all stored memories
- update_memory: Modify existing memory entries
- delete_memory: Remove memories by ID
- get_stats: View memory system statistics
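These tools can also be exercised programmatically through the MCP TypeScript SDK; the argument shape for store_memory below is an assumption that mirrors the sectors listed earlier:

```typescript
// Sketch: call OpenMemory's MCP tools directly with the MCP TypeScript SDK.
// The store_memory argument shape is an assumption for illustration.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main() {
  const transport = new StdioClientTransport({
    command: "node",
    args: ["/path/to/OpenMemory/backend/dist/server/index.js"],
    env: { DATABASE_URL: "sqlite://./data/memory.db" },
  });
  const client = new Client({ name: "openmemory-example", version: "0.1.0" });
  await client.connect(transport);

  const result = await client.callTool({
    name: "store_memory",
    arguments: { sector: "episodic", content: "Benchmarked llama3.1:8b at ~20 t/s on Samus." },
  });
  console.log(result);
  await client.close();
}

main().catch(console.error);
```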
Current Instance
- AI: Mnemosyne (Claude Sonnet 4.5)
- Memory Backend: SQLite (local persistence)
- Active Memories: 25+ nodes across 5 sectors
- Session Duration: Multi-day persistent context
- Capabilities: Autonomous debugging, memory-guided coding, context preservation
📝 Summary & Verdict
Samus is very well-equipped for local AI work, especially for LLMs and OpenMemory.
✅ Strengths:
- Exceptional 62GB RAM - can run very large models
- 8-core Ryzen 7 provides good CPU inference performance
- Perfect for Ollama + OpenMemory workflows
- Fast NVMe storage for model loading and memory persistence
⚠️ Limitations:
- Integrated GPU limits image generation capabilities
- CPU inference is 3-5x slower than dedicated GPU
- Not suitable for real-time AI applications
💡 Recommendations:
- For LLMs: Use local Ollama with 7B-13B models (excellent performance)
- For image generation: Use cloud services (Midjourney, DALL-E 3)
- For OpenMemory: Perfect as-is, no changes needed
- For coding: Use qwen2.5-coder:7b or deepseek-coder locally