This is a living document. We update it as priorities shift based on community feedback and production learnings. If something here excites you, open an issue or PR — we move fast on contributions.
- Python SDK (`moss`) — async-first, type-safe
- TypeScript SDK (`@moss-dev/moss`) — full feature parity with Python
- Built-in embedding models (`moss-minilm`)
- Hybrid search — combine semantic search with BM25 keyword matching
- Custom embedding support (bring your own OpenAI, Cohere, etc.)
- Metadata filtering (`$eq`, `$and`, `$in`, `$near`)
- Document management (add, upsert, get, delete)
- LangChain integration
- DSPy integration
- Pipecat voice agent integration
- LiveKit voice agent integration
- Next.js example app
- VitePress search plugin
- Docker deployment examples (ECS/K8s patterns)
- WebAssembly runtime — client-side semantic search in the browser, no server required
- Benchmarks directory — reproducible latency/throughput scripts comparing Moss vs Pinecone, Qdrant, and Chroma on standardized datasets
- MCP server — expose Moss as a Model Context Protocol server so any MCP-compatible AI tool (Claude, Cursor, Windsurf) can do semantic search
- Vercel AI SDK integration — retrieval provider for the Vercel AI SDK
- Ollama + Moss + Pipecat reference architecture — an end-to-end local LLM voice agent: Ollama for LLM inference, Moss for retrieval, Pipecat for real-time audio. A single `docker compose up` runs the entire stack.
- CrewAI integration — Moss as a retrieval tool for CrewAI agents
- Haystack integration — document store / retriever integration
- Reranking support — plug in cross-encoder rerankers as a post-retrieval step
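To make the filter operators above concrete, here is a minimal sketch of Mongo-style matching semantics for `$eq`, `$and`, and `$in`. This is an illustration of one plausible semantics, not Moss's actual implementation (`$near` is omitted because it needs distance context that plain dicts lack):

```python
def matches(metadata: dict, filter_: dict) -> bool:
    """Return True if `metadata` satisfies `filter_` (hypothetical semantics)."""
    for key, condition in filter_.items():
        if key == "$and":
            # $and takes a list of sub-filters; all must match.
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif isinstance(condition, dict):
            value = metadata.get(key)
            for op, operand in condition.items():
                if op == "$eq" and value != operand:
                    return False
                if op == "$in" and value not in operand:
                    return False
        else:
            # A bare value is shorthand for $eq.
            if metadata.get(key) != condition:
                return False
    return True

doc = {"lang": "en", "source": "docs", "year": 2024}
f = {"$and": [{"lang": {"$eq": "en"}}, {"source": {"$in": ["docs", "blog"]}}]}
print(matches(doc, f))  # True
```

Whether Moss evaluates filters pre- or post-retrieval is an implementation detail the roadmap doesn't specify.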
These are well-scoped and ready for contributors. Each one has (or will have) a corresponding GitHub issue with detailed instructions.
- Swift bindings — for iOS/macOS apps with on-device retrieval (`good first issue`)
- Go bindings — for backend services and CLI tools (`good first issue`)
- Elixir bindings — for Phoenix/LiveView apps (`good first issue`)
- Rust bindings — for performance-critical pipelines (`good first issue`)
- Kotlin bindings — for Android apps and Spring Boot backend services (`good first issue`)
- AutoGen — retrieval-augmented tool for AutoGen agents
- LlamaIndex — retriever and query engine integration
- Semantic Kernel — .NET/Python retrieval plugin
- LangGraph — retrieval node for stateful multi-agent workflows
- Google ADK — Moss as a retrieval tool for Google's Agent Development Kit
- OpenAI Agents SDK — Moss as a tool for the OpenAI agents framework
- Smolagents — lightweight retrieval tool for Hugging Face's agent framework
- Vapi integration — Moss retrieval tool for Vapi voice agents
- Daily.co integration — real-time audio pipeline with semantic context injection
- Twilio integration — retrieval for phone-based AI agents (IVR, call center bots)
- Moss CLI — manage indexes, run queries, import data, and inspect results from the terminal (`moss index create`, `moss query`, `moss import`)
- VS Code extension — semantic search over your codebase directly from the editor sidebar
- Multi-vector retrieval — support ColBERT-style late interaction models
- Doc-parsing connectors — ingest PDF, DOCX, HTML, and Markdown files directly into Moss indexes
- Chunking strategies — built-in text splitters (sentence, paragraph, recursive, semantic)
- Web crawling — crawl a URL and index the content
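The "recursive" splitter mentioned above is worth a sketch: try the coarsest separator first (paragraphs) and fall back to finer ones (sentences, words) only for pieces that are still too long. The function name and parameters here are illustrative, not Moss's API:

```python
def recursive_split(text: str, max_len: int = 200,
                    separators=("\n\n", "\n", ". ", " ")) -> list[str]:
    """Split `text` into chunks of at most `max_len` characters,
    preferring the earliest (coarsest) separator that appears."""
    if len(text) <= max_len:
        return [text] if text.strip() else []
    for sep in separators:
        if sep in text:
            chunks, current = [], ""
            for part in text.split(sep):
                candidate = current + sep + part if current else part
                if len(candidate) <= max_len:
                    current = candidate  # greedily pack parts into a chunk
                else:
                    chunks.extend(recursive_split(current, max_len, separators))
                    current = part
            chunks.extend(recursive_split(current, max_len, separators))
            return chunks
    # No separator present: hard-split as a last resort.
    return [text[i:i + max_len] for i in range(0, len(text), max_len)]

print(recursive_split("aaa bbb ccc ddd eee", max_len=7))
# → ['aaa bbb', 'ccc ddd', 'eee']
```

The semantic strategy from the list would replace the fixed separators with embedding-similarity breakpoints; that needs a model and is out of scope for this sketch.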
These are bigger bets we're exploring. They're directional, not committed — community input will shape what gets built.
- vLLM-based local inference + local search — a fully local pipeline: your model, your embeddings, your search, your hardware. No API calls. This is a natural fit for the privacy-first voice AI use case and can meaningfully cut latency for on-premise deployments.
- LLM-as-a-judge evaluation framework — automated retrieval quality scoring using LLM judges. We want to lay the foundation and let the community decide the direction — what metrics matter, which judges to support, how to benchmark fairly.
- Retrieval quality dashboard — visualize query performance, relevance scores, and failure modes over time
- Edge runtime support — run Moss in Cloudflare Workers, Deno Deploy, and Vercel Edge Functions
- Query expansion — LLM-powered query rewriting to improve recall on short or ambiguous queries
- Sparse-dense fusion (SPLADE) — learned sparse retrieval to complement BM25 hybrid, improving precision on rare terms
- Contextual retrieval — pre-chunking contextualization to make every chunk self-contained and more retrievable
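One common way to merge a sparse (BM25/SPLADE) ranking with a dense ranking, relevant to the fusion items above, is reciprocal rank fusion (RRF). The roadmap doesn't say which fusion method Moss uses, so treat this as one well-known option rather than the implementation; `k=60` is the conventional damping constant:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of doc IDs into one, scoring each
    doc by the sum of 1/(k + rank) over the lists it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["d3", "d1", "d2"]   # keyword ranking
dense = ["d1", "d2", "d4"]  # semantic ranking
print(rrf([bm25, dense]))   # d1 ranks first: it scores well in both lists
```

RRF needs only ranks, not raw scores, which sidesteps the problem that BM25 and cosine-similarity scores live on incomparable scales.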
Connect knowledge sources to Moss without writing custom ETL.
- GitHub connector — index code, issues, PRs, and docs from repositories
- Notion connector — sync and index Notion workspace pages
- Confluence connector — enterprise knowledge base indexing
- S3/GCS sync — auto-index documents from cloud storage buckets on upload
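One plausible shape for this connector layer: each connector yields `(id, text, metadata)` records that a sync step upserts into an index. Every name below is hypothetical — the actual Moss connector API is not yet defined:

```python
from dataclasses import dataclass, field
from typing import Iterator, Protocol

@dataclass
class Record:
    id: str
    text: str
    metadata: dict = field(default_factory=dict)

class Connector(Protocol):
    def fetch(self) -> Iterator[Record]: ...

class InMemoryConnector:
    """Toy stand-in for a GitHub/Notion/S3 connector."""
    def __init__(self, docs: dict[str, str]):
        self.docs = docs

    def fetch(self) -> Iterator[Record]:
        for doc_id, text in self.docs.items():
            yield Record(id=doc_id, text=text, metadata={"source": "memory"})

def sync(connector: Connector, index: dict[str, Record]) -> int:
    """Upsert every fetched record into `index`; return the count."""
    n = 0
    for record in connector.fetch():
        index[record.id] = record
        n += 1
    return n

index: dict[str, Record] = {}
sync(InMemoryConnector({"readme": "Moss is a search library."}), index)
```

A real GitHub or S3 connector would add incremental sync (cursors, upload notifications) behind the same `fetch()` interface.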
- Pick something from "Next Up" — these are ready for PRs
- Check the issues — look for `good first issue` and `help wanted` labels
- Propose something new — open an issue describing what you want to build. We're open to ideas that aren't on this list.
- Read the Contributing Guide — fork, branch from `main`, PR
If you're unsure where to start, drop a message in Discord and we'll point you in the right direction.