[SIGCOMM 2023] Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference
Updated Nov 17, 2023 - Verilog
[Long Term Support] [SIGCOMM 2023] Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference
The official repo for the paper "Optimizing LLM Inference Throughput via Memory-aware and SLA-constrained Dynamic Batching"
Official repo for the ACSOS 2021 paper on how to manage many deep learning models at the edge!
Local inference serving with adaptive batching, benchmark sweeps, and regression gates for batching tradeoffs.
Machine-readable companion to the IEEE OJ-CS survey 'Semantic Caching and Response Reuse for Large Language Model Services: A Survey' (Chukkapalli, Mishra, Naik, 2026): 21-work evidence matrix, systematic-search log, proposed benchmark trace schema, stdlib-only contract validator, and CPU pilot. Code MIT; data CC-BY-4.0.