
TL;DR
Meetily was designed from the start with a pluggable model interface - local or cloud, your choice. After users reported that smaller local models hallucinate and miss action items, we noticed power users preferred cloud models for summarization while keeping transcription local. Here's what we learned about the trade-offs and how we evaluated each provider's privacy policy.
When we started building Meetily, our north star was privacy. No cloud uploads. No meeting bots. Everything runs on your machine.
We assumed the AI models had caught up. Whisper for transcription. Local LLMs for summarization. Simple.
We were partially right about transcription. But summarization? That's where things got complicated.
The Promise We Made
Meetily was built on a clear principle: 100% local processing. Your meetings contain sensitive information - business strategies, client details, personnel discussions. We believed (and still believe) that you shouldn't have to trust a third party with this data.
So we shipped the Community Edition with local-only AI:
- Transcription: Whisper.cpp running entirely on your device
- Summarization: Local LLMs via Ollama or our bundled Gemma models
It worked. But "works" and "works well" are different things.
What We Heard From Users
Within weeks of our release, the feedback started coming in. The transcription was solid - Whisper is genuinely impressive. But the summaries? Users reported:
- Hallucinations - The AI would confidently state things that were never said
- Missing action items - Key tasks assigned in the meeting didn't appear in the summary
- Missing important points - Major decisions glossed over or omitted entirely
- Context issues - The AI would misunderstand who said what, or conflate different topics
For a product that promises to capture your meetings accurately, this was a problem.
Why Local Models Struggle
We tested several local models:
| Model | Parameters | Performance | Issues |
|---|---|---|---|
| Gemma 1B | 1 billion | Fast, low memory | Frequent hallucinations, poor context retention |
| Gemma 4B | 4 billion | Reasonable speed | Still misses key action items |
| Mistral 7B | 7 billion | Good quality when it works | Slow; sometimes fails on a 24GB Mac |
The issue isn't that local models are bad. It's that meeting transcripts present a specific challenge:
- Long context - A 1-hour meeting might generate 10,000+ words of transcript
- Implicit context - "Let's do what we discussed last time" requires understanding what wasn't said
- Speaker attribution - Who said what matters enormously for action items
- Domain knowledge - Technical jargon, company-specific terms, acronyms
Smaller models (under 7B parameters) simply don't have the capacity to handle all of this reliably. And larger models that can? They need serious hardware - we're talking 32GB+ RAM, ideally with GPU acceleration.
Most users don't have that. And even those who do often found the processing time unacceptable.
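One common workaround for the long-context problem is map-reduce summarization: split the transcript into overlapping chunks that fit a small model's context window, summarize each chunk, then summarize the summaries. The sketch below is illustrative of that general technique, not Meetily's actual pipeline; the `summarize` callable stands in for any model call, local or cloud.

```python
# Sketch of map-reduce summarization for long transcripts.
# `summarize` is a stand-in for any model call (an assumption,
# not Meetily's real interface).

from typing import Callable, List

def chunk_transcript(words: List[str], chunk_size: int = 2000,
                     overlap: int = 200) -> List[str]:
    """Split a transcript into overlapping word chunks so each piece
    fits in a small model's context window."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

def map_reduce_summary(transcript: str,
                       summarize: Callable[[str], str],
                       chunk_size: int = 2000) -> str:
    """Summarize each chunk independently (map), then summarize the
    concatenated chunk summaries (reduce)."""
    words = transcript.split()
    if len(words) <= chunk_size:
        return summarize(transcript)
    partials = [summarize(c) for c in chunk_transcript(words, chunk_size)]
    return summarize("\n".join(partials))
```

The overlap between chunks helps preserve context that straddles a chunk boundary, though cross-chunk references ("as Sarah said earlier") remain a weak spot for this approach.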
Our Transcription Story Is Different
For transcription, local models work excellently. We offer two engines:
- Whisper (via whisper-rs): High accuracy across 99+ languages. The large-v3 model is the gold standard.
- Parakeet: Faster with excellent accuracy for English and French specifically.
Users can choose between them in the app. For transcription, local-first works. Summarization is where the gap appears.
How Meetily's Pluggable Architecture Helps
From day one, Meetily was designed with a pluggable model interface. Users can connect:
- Local models: Ollama, bundled Gemma, any OpenAI-compatible local endpoint
- Cloud models: Claude, OpenAI, Groq, or any OpenAI-compatible API
- Hosted AI: Our own hosted option. No API keys, no setup. Just works. Pro users get generous free credits, and trial users get 10 free credits to try it out.
This wasn't a pivot - it was always the architecture. What we learned from user feedback is that power users who could run both options typically preferred cloud models for summarization quality, while keeping transcription fully local.
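The reason an OpenAI-compatible interface makes this pluggable is that local and cloud backends can share one client code path, differing only in endpoint and model name. Here's a minimal sketch of that idea; the model names are illustrative assumptions, not Meetily's internal configuration (Ollama's OpenAI-compatible endpoint does default to port 11434).

```python
# Hypothetical backend selection for an OpenAI-compatible client.
# URLs follow each provider's documented OpenAI-compatible endpoint;
# model names are illustrative placeholders.

from dataclasses import dataclass

@dataclass
class Backend:
    base_url: str   # any OpenAI-compatible /v1 endpoint
    model: str
    api_key: str

def select_backend(provider: str, api_key: str = "not-needed") -> Backend:
    """Return connection details for a summarization backend.
    Local endpoints need no real API key."""
    backends = {
        "ollama": Backend("http://localhost:11434/v1", "gemma3:4b", "not-needed"),
        "openai": Backend("https://api.openai.com/v1", "gpt-4o", api_key),
        "groq":   Backend("https://api.groq.com/openai/v1", "llama-3.1-70b-versatile", api_key),
    }
    if provider not in backends:
        raise ValueError(f"unknown provider: {provider}")
    return backends[provider]
```

With this shape, swapping local for cloud is a configuration change, not a code change: the same chat-completions request works against any of the three.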
We also heard from users who just wanted something that worked out of the box without setting up API keys or running local models. That's why we added Hosted AI. You don't need to configure anything. Start a meeting, get a summary. If you don't like it, switch to local or BYOK anytime.
Some users genuinely cannot use cloud services. Healthcare providers with PHI. Legal teams with privileged communications. Security researchers. For them, the fully local path works well.
For users whose meetings don't contain regulated data, a cloud model with a clear privacy policy is often the right trade-off for better accuracy. The point is: you get to choose.
The Providers We Support (And Their Privacy Policies)
When we decided to add cloud options, we didn't just pick the most popular providers. We read every privacy policy, data processing agreement, and terms of service. Here's what we found:
Anthropic Claude API
| Aspect | Policy |
|---|---|
| Data retention | 7-30 days for abuse monitoring |
| Training on your data | No. API data is NOT used for training |
| Zero Data Retention | Available for enterprise customers |
| Our take | Strong privacy stance for a commercial API |
Claude consistently produces the highest quality meeting summaries in our testing. Anthropic has been clear: API inputs and outputs are not used for model training. In September 2025, they reduced retention from 30 days to 7 days.
For users who can accept cloud processing, Claude is our recommended option for accuracy.
Source: Anthropic Privacy Center
Groq API
| Aspect | Policy |
|---|---|
| Data retention | NOT retained by default |
| Zero Data Retention | Can enable ZDR in Data Controls settings |
| Training on your data | No explicit statement, though the policy implies no training |
| Our take | Privacy-friendly with explicit ZDR option |
Groq offers a variety of models (Llama, Mixtral, Gemma, and others) at extremely fast inference speeds via their custom LPU hardware. Their default is to not retain customer data, and you can explicitly enable Zero Data Retention.
For users who want cloud speed without prolonged data storage, Groq is a solid choice.
Source: Groq Privacy Policy, GroqCloud Data Documentation
OpenAI API
| Aspect | Policy |
|---|---|
| Data retention | 30 days for abuse monitoring |
| Training on your data | No - API/business data NOT used unless you opt in |
| Zero Data Retention | Available for qualifying businesses |
| Compliance | SOC 2 Type 2 certified, HIPAA BAA available |
| Our take | Clear policies, strong compliance certifications |
OpenAI's API has a well-documented privacy stance: they do not train on API data unless you explicitly opt in. Zero Data Retention is available for enterprise customers.
Source: OpenAI Enterprise Privacy, OpenAI Business Data
Google Gemini API
| Aspect | Policy |
|---|---|
| Data retention | Free tier: 55 days. Paid tier: 30 days |
| Training on your data | Free tier: YES. Paid/Enterprise: NO |
| Our take | Be careful with tier - free tier trains on your data |
Gemini is where you need to read the fine print. Free tier users have their data retained for 55 days AND used for model training. Paid and enterprise customers get shorter 30-day retention and are excluded from training.
If you use Gemini through Meetily, understand which tier you're on.
Source: Gemini API Terms of Service
OpenWebUI (Self-Hosted)
| Aspect | Policy |
|---|---|
| Data retention | You control it - runs on your infrastructure |
| Training on your data | No - chat history not used for training |
| Our take | Best privacy option if you can self-host |
OpenWebUI is a self-hosted interface for running LLMs. When configured properly, it gives you the best of both worlds: better models than you can run locally, with complete data control.
For privacy-maximizing users who have the technical capability, this is the gold standard.
Source: OpenWebUI
Local Ollama Models
| Aspect | Policy |
|---|---|
| Data retention | N/A - runs entirely locally |
| Training on your data | No - never leaves your device |
| Our take | Maximum privacy, variable quality |
For users who cannot accept any cloud processing, Ollama remains available. We bundle Gemma models for users who don't want to install Ollama separately.
Quality varies by model and hardware. If you have a powerful machine (32GB+ RAM, GPU), Mistral 7B or larger models can produce good results. On modest hardware, expect trade-offs.
Our Current Recommendation
Based on our testing and privacy research:
| Scenario | Recommended Provider |
|---|---|
| Regulated data (HIPAA, legal) | Local Ollama or OpenWebUI self-hosted |
| Maximum privacy, non-regulated | Groq with ZDR enabled |
| Best quality, privacy-conscious | Anthropic Claude API |
| Just want it to work | Hosted AI (free credits included with Pro) |
| Enterprise with compliance needs | OpenAI with BAA or Claude Enterprise |
| Budget-conscious | Local Gemma (bundled) or Groq free tier |
What We're Still Figuring Out
This isn't a "we solved it" post. We're still exploring:
- Fine-tuning local models - Can we train a smaller model specifically for meeting summarization?
- Hybrid approaches - Local processing for sensitive sections, cloud for general summary?
- Better prompting - Are there prompt engineering techniques that improve local model output?
- Newer models - The local AI landscape evolves rapidly. Today's limitation might be solved in 6 months.
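On the prompting question, one technique worth sketching is forcing the model into a fixed, extractive structure with an explicit "say so if absent" rule, rather than asking for a free-form summary. This is an illustrative example of the idea, not Meetily's shipped prompt.

```python
# Illustrative structured prompt for meeting summarization.
# The sections and rules are assumptions for the sketch, not
# Meetily's production prompt.

def build_summary_prompt(transcript: str) -> str:
    """Build a prompt that asks for grounded, sectioned output and an
    explicit 'None mentioned' rule to discourage hallucination."""
    return (
        "You are summarizing a meeting transcript.\n"
        "Rules:\n"
        "- Only state facts that appear in the transcript.\n"
        "- If a section has no content, write 'None mentioned'.\n"
        "- Attribute each action item to the speaker who accepted it.\n\n"
        "Output sections:\n"
        "1. Key decisions\n"
        "2. Action items (owner, task, deadline if stated)\n"
        "3. Open questions\n\n"
        f"Transcript:\n{transcript}"
    )
```

In our experience, constraints like "None mentioned" give smaller models an honest exit instead of inventing content to fill a section, though they don't eliminate hallucination entirely.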
We committed to building a privacy-first meeting assistant. That commitment doesn't mean we pretend local models are perfect when they're not. It means we're honest about trade-offs, transparent about provider policies, and committed to improving.
The Transcription Side: A Brighter Story
While summarization has been challenging, transcription is working well locally. We offer two engines:
Whisper (via whisper-rs)
- Models: tiny, base, small, medium, large-v3
- Languages: 99+ supported
- Quality: High accuracy with large-v3
- Hardware: Scales with model size; large-v3 needs ~3GB VRAM
Parakeet
- Optimized for: English and French
- Quality: Excellent for supported languages
- Speed: Faster than equivalent Whisper models
- Best for: Users who primarily have English/French meetings
Both run entirely locally. Both work offline after initial model download. For transcription, the local-first approach works.
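Since Whisper quality scales with model size and memory, a simple heuristic can pick the largest model the hardware supports. The thresholds below are illustrative guesses anchored only to the ~3GB figure for large-v3 mentioned above; this is not Meetily's actual selection logic.

```python
# Illustrative heuristic for picking a Whisper model size from
# available memory. Only the ~3GB large-v3 figure comes from the
# post; the other thresholds are assumptions.

def pick_whisper_model(free_vram_gb: float) -> str:
    """Return the largest Whisper model that plausibly fits."""
    if free_vram_gb >= 3.0:
        return "large-v3"   # gold standard, ~3GB VRAM
    if free_vram_gb >= 1.5:
        return "medium"
    if free_vram_gb >= 0.8:
        return "small"
    return "base"           # safe fallback on modest hardware
```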
What About Accuracy Claims?
You'll notice we don't claim specific accuracy percentages like "95% accurate." That's intentional. Accuracy varies dramatically by:
- Audio quality
- Number of speakers
- Accents and dialects
- Background noise
- Technical jargon
We say "high accuracy" because that's honest. Anyone claiming a universal percentage is oversimplifying.
The Trade-offs Are Real
"Local-first" doesn't automatically mean "best." Sometimes it does. Sometimes it means trading quality for privacy. Here's how we think about it:
- For transcription: Local models (Whisper, Parakeet) work excellently. No reason to go cloud.
- For summarization: Quality gap is real. Smaller local models hallucinate more. Larger models need serious hardware.
- Your choice: Meetily lets you pick - fully local, fully cloud, or mix and match per use case.
What This Means For You
If you're using Meetily:
- Community Edition offers fully local processing
- Pro Edition includes Hosted AI with free credits, plus cloud model integrations
- You choose which provider (if any) to use for summarization
- Privacy policies above help you evaluate the trade-offs
The architecture was always designed for choice. Use what works for your situation.
Key Takeaways
1. Local transcription (Whisper/Parakeet) works well - we recommend it for all users
2. Local summarization has trade-offs: smaller models hallucinate, larger models need serious hardware
3. We now offer Hosted AI (no setup) and cloud summarization with vetted providers
4. Anthropic Claude, Groq, and OpenAI don't train on API data
5. Google Gemini free tier DOES train on your data - be aware of your tier
6. Self-hosted OpenWebUI gives cloud quality with full data control
7. You always choose - fully local remains available in Community Edition
Try It Yourself
Meetily Community Edition is free and open source. Pro includes Hosted AI with free credits for users who just want great summaries without setup. Or bring your own API key. Or go fully local. Whichever path you pick - fully local, Hosted AI, or your own cloud provider - you're in control.
Download Meetily
Privacy-first meeting transcription. Local, Hosted AI, or bring your own API key.
This post reflects our understanding as of February 2026. Privacy policies change. We'll update this post if providers make significant changes. If you spot anything outdated, email us at meetily@zackriya.com.
Ready to try Meetily?
Join 76,000+ users who use Meetily for private meeting transcription. No bots, privacy first. Community Edition free.
Star on GitHub (9.9k+) · Open source & self-hostable


