Groq is a semiconductor company that designs custom chips (the Tensor Streaming Processor, or TSP) optimized for fast LLM inference. Groq's API provides ultra-low-latency access to open-weight models such as Llama and Mixtral. Advanced practitioners integrate Groq into performance-critical applications: real-time chat, coding assistants, content generation, and customer-service bots. The discipline blends prompt engineering, API integration, performance optimization, and judgment about when speed matters most. Practitioners trade raw model capability (the models Groq serves are smaller and faster than frontier models like GPT-4) for latency gains.
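As a concrete sketch of the API-integration side, the payload below follows the OpenAI-compatible chat-completions schema that Groq's endpoint accepts. The endpoint URL and model name here are illustrative assumptions, not confirmed by this text; check Groq's current model list before use. Only the request construction is shown, so no API key or network call is needed:

```python
import json

# Illustrative values: Groq exposes an OpenAI-compatible endpoint,
# but the exact URL and available model names change over time.
GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"  # assumed endpoint

def build_request(prompt: str,
                  model: str = "llama-3.1-8b-instant",  # hypothetical model id
                  max_tokens: int = 256) -> dict:
    """Build an OpenAI-compatible chat payload for a low-latency request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        # Streaming matters for latency-sensitive apps: the user sees the
        # first tokens as soon as they arrive instead of waiting for the
        # full completion.
        "stream": True,
    }

# Serialize the payload as it would be POSTed (with an Authorization
# header carrying the API key, omitted here).
payload = json.dumps(build_request("Summarize this support ticket."))
```

In practice this payload would be sent with any HTTP client; the design point is that the request shape is identical to other OpenAI-compatible providers, so swapping Groq in for latency-critical paths requires changing only the base URL, key, and model name.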