Open Role

AI Silicon Systems Software and Architecture Intern

Work across AI accelerator benchmarks, simulators, compiler/runtime prototypes, operator mapping, and real-system validation.

Responsibilities

  • Support AI accelerator benchmark design, evaluation-flow setup, and result analysis.
  • Profile and analyze representative workloads such as LLM inference to identify bottlenecks.
  • Design microbenchmarks to study memory access, utilization, latency, and related behavior.
  • Help build simulator environments, configuration management, regression tests, trace parsers, and experiment frameworks.
  • Explore compiler, runtime, operator mapping, scheduling, memory layout, tiling, and dataflow prototypes.
  • Once hardware is available, support bring-up, functional validation, performance testing, and simulator-to-silicon comparison.

Requirements

  • Major in computer science, electronic engineering, automation, software engineering, integrated circuits, AI, or a related field; graduate students preferred, strong undergraduates welcome.
  • Solid computer-science fundamentals, plus strong interest or experience in at least one of the following: computer architecture, compilers/systems software, parallel or high-performance computing, AI inference systems, model deployment, or performance optimization.
  • Proficiency in C/C++ or Python.
  • Strong analytical, experimental, and technical-writing ability.
  • Self-driven learner who can move quickly on early-stage projects despite uncertainty.

Nice to have

  • Coursework or deep study in computer architecture, parallel architecture, storage systems, or compilers.
  • Experience with benchmarking, profiling, trace analysis, or performance modeling.
  • Familiarity with PyTorch, CUDA, TVM, MLIR, LLVM, Triton, ONNX Runtime, vLLM, or related tools.
  • Experience with simulators, microbenchmarks, runtime systems, or kernel optimization.
  • Practical understanding of LLM inference, the KV cache, attention, and the memory hierarchy.
  • Research experience, systems projects, or open-source contributions.