publications
Publications by category in reverse chronological order. * denotes co-first author.
2024
- NeurIPS: Communication Bounds for the Distributed Experts Problem. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024
2023
- arXiv: Towards efficient generative large language model serving: A survey from algorithms to systems. arXiv preprint arXiv:2312.15234, 2023
2021
2020
- arXiv