publications
Publications by category in reverse chronological order. * denotes co-first author.
2024
2023
- [arXiv] Towards efficient generative large language model serving: A survey from algorithms to systems. arXiv preprint arXiv:2312.15234, 2023
2021
2020
- arXiv