Google dropped Gemini 3.1 Flash Lite today. Lightweight, cheap, fast. The positioning is clear: not every API call needs the full model, and Google wants to own the cost-sensitive tier of the inference market.
Apple announced the MacBook Air with the M5 chip. The headline feature is a Neural Accelerator embedded in each CPU core, and the headline number is up to 6.9x faster AI video enhancement compared to the M1. Pre-orders start tomorrow, shipping March 11. Apple keeps making the case that local AI processing is a hardware differentiation story.
Topaz Labs introduced NeuroStream, a proprietary VRAM optimization technology that claims to reduce VRAM usage by up to 95%. The practical upshot is running models that normally need 16-24GB VRAM on a standard gaming GPU. They built it in collaboration with Nvidia, targeting GeForce RTX and RTX PRO cards specifically. It launched alongside Wonder 2 (Local) for photos, with video support coming soon. Right now it’s Nvidia-only, and Linux support wasn’t mentioned.
For context on the 95% number: their 4K upscaling pipelines normally want 24GB+ VRAM, so a full 95% cut would land around 1.2GB, comfortably inside an RTX 4070-class card’s 12GB with room to spare. Whether that holds up in real benchmarks remains to be seen, but the Nvidia partnership suggests this isn’t vaporware.
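As a back-of-envelope check, here is the arithmetic as a tiny sketch. The 24GB baseline and the up-to-95% figure come from the announcement; the helper function and the 12GB RTX 4070 comparison are mine, and real-world reductions will almost certainly vary by model and workload:

```python
def effective_vram_gb(baseline_gb: float, reduction_pct: float) -> float:
    """VRAM still required after a claimed percentage reduction."""
    return baseline_gb * (1 - reduction_pct / 100)

# Claimed best case: 24 GB workload, 95% reduction.
needed = effective_vram_gb(24, 95)
print(f"{needed:.1f} GB needed")          # ~1.2 GB at the claimed maximum

# Even a far more modest 50% reduction would fit a 12 GB card.
print(effective_vram_gb(24, 50) <= 12)
```

The point of the sketch is that the claim doesn’t need to hit 95% to matter: anything past a 50% cut already moves a 24GB workload onto mainstream GeForce hardware.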
Employees at Google and OpenAI also signed an open letter called We Will Not Be Divided, with roughly 900 signatures: about 800 from Google and 100 from OpenAI. It’s a direct response to the Pentagon blacklisting Anthropic, with signatories pushing back against military AI applications in surveillance and autonomous weapons.
Still no DeepSeek V4.