feat(backend): add SpiritLM backend for text, TTS, and ASR #8589
MkDev11 wants to merge 10 commits into mudler:master
Conversation
Implements a LocalAI backend for Meta Spirit LM (interleaved text and speech).

- backend/python/spiritlm: gRPC servicer with LoadModel, Predict, PredictStream, TTS, TTSStream, AudioTranscription, Health
- Supports spirit-lm-base-7b and spirit-lm-expressive-7b
- Options: sample_rate (default 16000)
- backend/index.yaml: add spiritlm meta and capabilities

Ref: mudler#3966
Signed-off-by: mkdev11 <MkDev11@users.noreply.github.com>
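The options handling described above can be sketched as follows. This is a minimal illustration, assuming options arrive as repeated "key:value" strings (a common convention for LocalAI backend options); `parse_options` is a hypothetical helper name, not the PR's actual code:

```python
def parse_options(options):
    """Parse "key:value" option strings into a dict, defaulting sample_rate.

    Hypothetical helper: assumes LocalAI-style repeated "key:value" strings.
    """
    parsed = {"sample_rate": 16000}  # default per the commit message
    for opt in options:
        if ":" not in opt:
            continue  # ignore malformed entries
        key, value = opt.split(":", 1)
        # keep numeric options as ints, everything else as strings
        parsed[key] = int(value) if value.lstrip("-").isdigit() else value
    return parsed
```

With this shape, `parse_options(["sample_rate:24000"])` overrides the default, while an empty options list leaves `sample_rate` at 16000.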
@mudler could you please review the PR and let me know your feedback?
Can you add an entry in gallery/index.yaml similar to Qwen-ASR? |
Add spirit-lm-base-7b and spirit-lm-expressive-7b to model gallery, following the same pattern as Qwen-ASR (per PR mudler#8589 review). Signed-off-by: mkdev11 <MkDev11@users.noreply.github.com>
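An entry following the Qwen-ASR pattern might look roughly like the sketch below. Every field name and value here is illustrative only — the actual schema should be taken from the existing Qwen-ASR entry in gallery/index.yaml, and this is not the merged entry:

```yaml
## Illustrative sketch only — mirror the real Qwen-ASR entry's fields.
- name: "spirit-lm-base-7b"
  description: |
    Meta Spirit LM base 7B: interleaved text and speech
    (text generation, TTS, ASR) in a single model.
  tags:
    - text-to-text
    - tts
    - asr
  overrides:
    backend: spiritlm
```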
@richiejp Thanks for your feedback. I added SpiritLM to gallery/index.yaml.
Thanks, the bottleneck on our end is testing. If you provide e2e tests then we can verify these and get it merged faster. |
@richiejp I added e2e tests, please review the update again and let me know your feedback.
you still need to add e2e tests |
- Add 'SpiritLM backend e2e' context in core/http/app_test.go (label: spiritlm) with specs: chat completion, TTS, transcription; skip when backend/model not ready
- Add make test-spiritlm target; pass SPIRITLM_CHECKPOINTS_DIR when set
- Add backend/python/spiritlm/E2E.md with run instructions and full-pass steps
- Fix protogen.sh to use the repo backend proto path; add backend/python/backend.proto symlink for runProtogen; make run.sh executable

Ref: mudler#8589, mudler#3966
sorry, just pushed e2e tests
- Add tests/e2e/spiritlm_e2e_test.go with chat, TTS, and transcription specs
- Register spiritlm mock backend and spirit-lm-base-7b model in e2e_suite_test.go
- Add make test-e2e-spiritlm target; fix protogen-go PATH for protoc plugins
- Update backend/python/spiritlm/E2E.md with tests/e2e coverage and run instructions

Signed-off-by: mkdev11 <MkDev11@users.noreply.github.com>
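The "skip when backend/model not ready" behaviour can be illustrated with a small readiness check. This is a hedged sketch (`spiritlm_e2e_ready` is a hypothetical name), assuming the suite gates on SPIRITLM_CHECKPOINTS_DIR pointing at an existing checkpoints directory, as the E2E.md run instructions suggest:

```python
import os

def spiritlm_e2e_ready(env):
    """Return True when the SpiritLM e2e prerequisites appear to be met.

    Hypothetical helper: assumes the suite skips unless
    SPIRITLM_CHECKPOINTS_DIR names an existing directory.
    """
    ckpt = env.get("SPIRITLM_CHECKPOINTS_DIR", "")
    return bool(ckpt) and os.path.isdir(ckpt)

# A runner would skip the chat/TTS/transcription specs when this returns False.
```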
Hi @richiejp, can you please let me know what I need to update or change?
Description
Fixes #3966
Adds a new LocalAI backend for Meta Spirit LM: an interleaved text and speech model that supports text generation, text-to-speech (TTS), and automatic speech recognition (ASR) in a single 7B model.
Changes:
- LoadModel: loads `spirit-lm-base-7b` or `spirit-lm-expressive-7b`
- Predict/PredictStream: text generation via `OutputModality.TEXT`
- TTS/TTSStream: text → speech via `OutputModality.SPEECH` (float32 → 16 kHz WAV)
- AudioTranscription: speech → text via `OutputModality.TEXT` from an audio path (`request.dst`)
- Health, options parsing (`sample_rate`, etc.)
- backend/index.yaml: `&spiritlm` meta with description, tags (text-to-text, TTS, ASR, LLM, multimodal), capabilities (cpu-spiritlm, cuda12-spiritlm)

Notes for Reviewers
- requirements-install.txt installs from git+https://github.com/facebookresearch/spiritlm.git. Checkpoints must be set up separately per the SpiritLM repo.
- backend.proto is copied into the backend dir per the existing Dockerfile.python.
- License: fair-noncommercial (Meta FAIR Noncommercial Research License).
- Signed commits
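As a reviewer aid, the float32 → 16 kHz WAV step mentioned in the TTS bullet can be sketched with the standard library alone. This is a minimal mono-PCM illustration, not the backend's actual code; `float32_to_wav_bytes` is a hypothetical name:

```python
import io
import struct
import wave

def float32_to_wav_bytes(samples, sample_rate=16000):
    """Clamp float32 samples to [-1, 1], convert to 16-bit PCM, wrap in mono WAV."""
    pcm = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)        # mono
        w.setsampwidth(2)        # 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return buf.getvalue()
```

Reading the bytes back with `wave.open` confirms a mono, 16-bit file at the requested 16 kHz rate.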