inference-serving

Here are 6 public repositories matching this topic...

hipersys-team / lightning

[SIGCOMM 2023] Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference

machine-learning hardware verilog fpga-soc smartnic photonic-computing rfsoc inference-serving

Updated Nov 17, 2023
Verilog

open-photonics / lightning-lts

[Long Term Support] [SIGCOMM 2023] Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference

verilog fpga-soc smartnic machine-learning-systems photonic-computing rfsoc inference-serving in-network-computing

Updated Sep 20, 2024
Verilog

KevinLee1110 / dynamic-batching

The official repo for the paper "Optimizing LLM Inference Throughput via Memory-aware and SLA-constrained Dynamic Batching"

inference-serving llm vllm

Updated Mar 17, 2025

cake-lab / CremeBrulee

Official repo for the ACSOS 2021 paper on how to manage many deep learning models at the edge!

deep-learning model-caching inference-serving

Updated Sep 29, 2021
Python

GoparapukethaN / streaminfer

Local inference serving with adaptive batching, benchmark sweeps, and regression gates for batching tradeoffs.

python benchmark machine-learning real-time latency websocket load-testing inference regression-testing performance-testing model-serving quality-gates mlops fastapi inference-serving llm-inference adaptive-batching

Updated May 20, 2026
Python

dchukkapalli-dev / semantic-caching-llm-companion

Machine-readable companion to the IEEE OJ-CS survey 'Semantic Caching and Response Reuse for Large Language Model Services: A Survey' (Chukkapalli, Mishra, Naik, 2026): 21-work evidence matrix, systematic-search log, proposed benchmark trace schema, stdlib-only contract validator, and CPU pilot. Code MIT; data CC-BY-4.0.

benchmark survey prisma inference-serving llm semantic-caching response-reuse cache-correctness

Updated Jun 5, 2026
Python
