AI Infrastructure Engineer

Chantilly, Virginia

Top Secret or TS/SCI clearance is required to start
$200K to $250K

What You'll Do

  • Deploy and optimize self-hosted LLM inference servers such as vLLM and Ollama.
  • Containerize AI workloads using Docker and orchestrate production environments with Kubernetes, including GPU scheduling.
  • Build and maintain AI serving infrastructure, including gateways, load balancing, authentication, TLS, and rate limiting.
  • Optimize GPU utilization, memory management, quantization, batching, and capacity planning to balance performance and cost.
  • Develop and maintain CI/CD pipelines, observability, monitoring, and incident response processes.

What You'll Bring (Required)

  • Hands-on experience deploying and serving Large Language Models (LLMs) in production.
  • Strong experience with Docker and production Kubernetes environments, including GPU scheduling.
  • Deep understanding of self-hosted AI infrastructure, including model formats, quantization, GPU memory management, batching, and inference optimization.
  • Experience supporting production applications with networking, reverse proxies, load balancing, authentication, and TLS.
  • Proficiency with Linux administration and Python and/or Bash scripting.
  • Ownership mindset with the ability to operate and improve production AI infrastructure.

Nice to Have

  • Experience with CUDA, NVIDIA drivers, GPU Operators, or other GPU infrastructure technologies.
  • Experience with Infrastructure as Code (Terraform, Helm).
  • Familiarity with observability and monitoring tools such as Prometheus and Grafana.
  • Experience building Retrieval-Augmented Generation (RAG) pipelines and working with vector databases (pgvector, Qdrant, Weaviate).
  • Experience with LLM gateway tools such as LiteLLM.
