Nikhil Pratap Ghanathe

Project Description

Improving speculative decoding on edge GPUs through an architectural bias that induces n-gram statistics in the draft model
– Distilled drafters learn the n-gram patterns of target models, leading to an improved token acceptance rate

Research Classification

  • Electrical engineering, computer engineering, and information engineering

Research Interests

  • Efficient AI
  • TinyML
  • Sustainable Computing

Faculty

Faculty of Applied Science