Our Quest for Meeting Summary Accuracy: What We Learned Building Meetily

12 min read · Build in Public
Meetily's journey to meeting summary accuracy - local AI vs cloud models comparison

TL;DR

Meetily was designed from the start with a pluggable model interface - local or cloud, your choice. After users reported that smaller local models hallucinate and miss action items, we noticed power users preferred cloud models for summarization while keeping transcription local. Here's what we learned about the trade-offs and how we evaluated each provider's privacy policy.

When we started building Meetily, our north star was privacy. No cloud uploads. No meeting bots. Everything runs on your machine.

We assumed the AI models had caught up. Whisper for transcription. Local LLMs for summarization. Simple.

We were partially right about transcription. But summarization? That's where things got complicated.

The Promise We Made

Meetily was built on a clear principle: 100% local processing. Your meetings contain sensitive information - business strategies, client details, personnel discussions. We believed (and still believe) that you shouldn't have to trust a third party with this data.

So we shipped the Community Edition with local-only AI:

  • Transcription: Whisper.cpp running entirely on your device
  • Summarization: Local LLMs via Ollama or our bundled Gemma models

It worked. But "works" and "works well" are different things.

What We Heard From Users

Within weeks of our release, the feedback started coming in. The transcription was solid - Whisper is genuinely impressive. But the summaries? Users reported:

  • Hallucinations - The AI would confidently state things that were never said
  • Missing action items - Key tasks assigned in the meeting didn't appear in the summary
  • Missing important points - Major decisions glossed over or omitted entirely
  • Context issues - The AI would misunderstand who said what, or conflate different topics

For a product that promises to capture your meetings accurately, this was a problem.

Why Local Models Struggle

We tested several local models:

Model      | Parameters | Performance                | Issues
Gemma 1B   | 1 billion  | Fast, low memory           | Frequent hallucinations, poor context retention
Gemma 4B   | 4 billion  | Reasonable speed           | Still misses key action items
Mistral 7B | 7 billion  | Good quality when it works | Slow; can fail on a 24GB Mac

The issue isn't that local models are bad. It's that meeting transcripts present a specific challenge:

  1. Long context - A 1-hour meeting might generate 10,000+ words of transcript
  2. Implicit context - "Let's do what we discussed last time" requires understanding what wasn't said
  3. Speaker attribution - Who said what matters enormously for action items
  4. Domain knowledge - Technical jargon, company-specific terms, acronyms

Smaller models (under 7B parameters) simply don't have the capacity to handle all of this reliably. And larger models that can? They need serious hardware - we're talking 32GB+ RAM, ideally with GPU acceleration.

Most users don't have that. And even those who do often found the processing time unacceptable.
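One common way around the long-context problem is map-reduce summarization: split the transcript into overlapping chunks a small model can handle, summarize each, then summarize the summaries. This is a minimal sketch of the idea, not Meetily's actual pipeline; the chunk sizes and the `summarize` callable are illustrative assumptions.

```python
# Sketch: map-reduce summarization for long transcripts. `summarize` is a
# placeholder for any local or cloud model call; sizes are illustrative.

def chunk_transcript(words: list[str], chunk_size: int = 2000, overlap: int = 200) -> list[str]:
    """Split a word list into overlapping chunks so a small-context
    model never sees more than chunk_size words at once."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

def summarize_long(transcript: str, summarize) -> str:
    """Map: summarize each chunk. Reduce: summarize the partial summaries."""
    words = transcript.split()
    if len(words) <= 2000:
        return summarize(transcript)
    partials = [summarize(chunk) for chunk in chunk_transcript(words)]
    return summarize("\n".join(partials))
```

The overlap keeps action items that straddle a chunk boundary from being lost, though the reduce step can still drop details - which is exactly the failure mode users reported.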

Our Transcription Story Is Different

For transcription, local models work excellently. We offer two engines:

  • Whisper (via whisper-rs): High accuracy across 99+ languages. The large-v3 model is the gold standard.
  • Parakeet: Faster with excellent accuracy for English and French specifically.

Users can choose between them in the app. For transcription, local-first works. Summarization is where the gap appears.

How Meetily's Pluggable Architecture Helps

From day one, Meetily was designed with a pluggable model interface. Users can connect:

  • Local models: Ollama, bundled Gemma, any OpenAI-compatible local endpoint
  • Cloud models: Claude, OpenAI, Groq, or any OpenAI-compatible API
  • Hosted AI: Our own hosted option. No API keys, no setup. Just works. Pro users get generous free credits, and trial users get 10 free credits to try it out.

This wasn't a pivot - it was always the architecture. What we learned from user feedback is that power users who could run both options typically preferred cloud models for summarization quality, while keeping transcription fully local.
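Because local and cloud backends both speak the OpenAI-compatible chat API, swapping providers reduces to swapping a base URL and model name. Here's a rough sketch of that idea - the `Provider` type and default URLs are assumptions for illustration, not Meetily's internals (Ollama does expose an OpenAI-compatible endpoint at /v1 by default).

```python
# Sketch: one request shape, many backends. Local Ollama and cloud APIs
# both accept the OpenAI chat-completions format, so only the endpoint
# and model name change. Names here are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    base_url: str      # any OpenAI-compatible /v1 endpoint
    model: str
    api_key: str = ""  # empty for local servers that skip auth

def build_chat_request(p: Provider, transcript: str) -> dict:
    """Assemble the JSON body for POST {base_url}/chat/completions."""
    return {
        "model": p.model,
        "messages": [
            {"role": "system", "content": "Summarize this meeting transcript."},
            {"role": "user", "content": transcript},
        ],
    }

local = Provider("ollama", "http://localhost:11434/v1", "gemma3:4b")
cloud = Provider("groq", "https://api.groq.com/openai/v1",
                 "llama-3.1-8b-instant", api_key="...")
```

The payload is identical either way; privacy is decided entirely by where `base_url` points.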

We also heard from users who just wanted something that worked out of the box without setting up API keys or running local models. That's why we added Hosted AI. You don't need to configure anything. Start a meeting, get a summary. If you don't like it, switch to local or BYOK anytime.

Some users genuinely cannot use cloud services. Healthcare providers with PHI. Legal teams with privileged communications. Security researchers. For them, the fully local path works well.

For users whose meetings don't contain regulated data, a cloud model with a clear privacy policy is often the right trade-off for better accuracy. The point is: you get to choose.

The Providers We Support (And Their Privacy Policies)

When we decided to add cloud options, we didn't just pick the most popular providers. We read every privacy policy, data processing agreement, and terms of service. Here's what we found:

Anthropic Claude API

Aspect | Policy
Data retention | 7-30 days for abuse monitoring
Training on your data | No - API data is NOT used for training
Zero Data Retention | Available for enterprise customers
Our take | Strong privacy stance for a commercial API

Claude consistently produces the highest quality meeting summaries in our testing. Anthropic has been clear: API inputs and outputs are not used for model training. As of September 2025, they reduced retention from 30 days to 7 days.

For users who can accept cloud processing, Claude is our recommended option for accuracy.

Source: Anthropic Privacy Center

Groq API

Aspect | Policy
Data retention | Not retained by default
Zero Data Retention | Can be enabled in Data Controls settings
Training on your data | No explicit statement, but their terms imply no training
Our take | Privacy-friendly with an explicit ZDR option

Groq offers a variety of models (Llama, Mixtral, Gemma, and others) at extremely fast inference speeds via their custom LPU hardware. Their default is to not retain customer data, and you can explicitly enable Zero Data Retention.

For users who want cloud speed without prolonged data storage, Groq is a solid choice.

Source: Groq Privacy Policy, GroqCloud Data Documentation

OpenAI API

Aspect | Policy
Data retention | 30 days for abuse monitoring
Training on your data | No - API/business data NOT used unless you opt in
Zero Data Retention | Available for qualifying businesses
Compliance | SOC 2 Type 2 certified, HIPAA BAA available
Our take | Clear policies, strong compliance certifications

OpenAI's API has a well-documented privacy stance: they do not train on API data unless you explicitly opt in. Zero Data Retention is available for enterprise customers.

Source: OpenAI Enterprise Privacy, OpenAI Business Data

Google Gemini API

Aspect | Policy
Data retention | Free tier: 55 days. Paid tier: 30 days
Training on your data | Free tier: YES. Paid/Enterprise: NO
Our take | Be careful with tier - the free tier trains on your data

Gemini is where you need to read the fine print. Free tier users have their data retained for 55 days AND used for model training. Paid and enterprise customers get shorter 30-day retention and are excluded from training.

If you use Gemini through Meetily, understand which tier you're on.

Source: Gemini API Terms of Service

OpenWebUI (Self-Hosted)

Aspect | Policy
Data retention | You control it - runs on your infrastructure
Training on your data | No - chat history not used for training
Our take | Best privacy option if you can self-host

OpenWebUI is a self-hosted interface for running LLMs. When configured properly, it gives you the best of both worlds: better models than you can run locally, with complete data control.

For privacy-maximizing users who have the technical capability, this is the gold standard.

Source: OpenWebUI

Local Ollama Models

Aspect | Policy
Data retention | N/A - runs entirely locally
Training on your data | No - data never leaves your device
Our take | Maximum privacy, variable quality

For users who cannot accept any cloud processing, Ollama remains available. We bundle Gemma models for users who don't want to install Ollama separately.

Quality varies by model and hardware. If you have a powerful machine (32GB+ RAM, GPU), Mistral 7B or larger models can produce good results. On modest hardware, expect trade-offs.

Our Current Recommendation

Based on our testing and privacy research:

Scenario | Recommended provider
Regulated data (HIPAA, legal) | Local Ollama or self-hosted OpenWebUI
Maximum privacy, non-regulated | Groq with ZDR enabled
Best quality, privacy-conscious | Anthropic Claude API
Just want it to work | Hosted AI (free credits included with Pro)
Enterprise with compliance needs | OpenAI with BAA or Claude Enterprise
Budget-conscious | Local Gemma (bundled) or Groq free tier

What We're Still Figuring Out

This isn't a "we solved it" post. We're still exploring:

  1. Fine-tuning local models - Can we train a smaller model specifically for meeting summarization?
  2. Hybrid approaches - Local processing for sensitive sections, cloud for general summary?
  3. Better prompting - Are there prompt engineering techniques that improve local model output?
  4. Newer models - The local AI landscape evolves rapidly. Today's limitation might be solved in 6 months.
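On the prompting question, one experiment we find promising is constraining the model explicitly: forbid information not in the transcript and force unknowns to be marked rather than guessed. This is an illustrative sketch of that idea, not a prompt Meetily ships.

```python
# Sketch: a summarization prompt that tries to reduce hallucination by
# restricting the model to transcript content and making "unknown" an
# allowed answer. Wording is an illustrative experiment.

SUMMARY_PROMPT = """You are summarizing a meeting transcript.
Rules:
- Use ONLY information present in the transcript below.
- If an action item has no clear owner, write "owner: unclear".
- If something is ambiguous, say it is ambiguous rather than guessing.

Transcript:
{transcript}

Output sections: Decisions, Action Items (task, owner, deadline), Open Questions."""

def build_summary_prompt(transcript: str) -> str:
    """Insert the transcript into the constrained prompt template."""
    return SUMMARY_PROMPT.format(transcript=transcript)
```

Smaller models follow such constraints less reliably than larger ones, which is part of why prompting alone hasn't closed the quality gap for us.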

We committed to building a privacy-first meeting assistant. That commitment doesn't mean we pretend local models are perfect when they're not. It means we're honest about trade-offs, transparent about provider policies, and committed to improving.

The Transcription Side: A Brighter Story

While summarization has been challenging, transcription is working well locally. We offer two engines:

Whisper (via whisper-rs)

  • Models: tiny, base, small, medium, large-v3
  • Languages: 99+ supported
  • Quality: High accuracy with large-v3
  • Hardware: Scales with model size; large-v3 needs ~3GB VRAM

Parakeet

  • Optimized for: English and French
  • Quality: Excellent for supported languages
  • Speed: Faster than equivalent Whisper models
  • Best for: Users who primarily have English/French meetings

Both run entirely locally. Both work offline after initial model download. For transcription, the local-first approach works.

What About Accuracy Claims?

You'll notice we don't claim specific accuracy percentages like "95% accurate." That's intentional. Accuracy varies dramatically by:

  • Audio quality
  • Number of speakers
  • Accents and dialects
  • Background noise
  • Technical jargon

We say "high accuracy" because that's honest. Anyone claiming a universal percentage is oversimplifying.

The Trade-offs Are Real

"Local-first" doesn't automatically mean "best." Sometimes it does. Sometimes it means trading quality for privacy. Here's how we think about it:

  • For transcription: Local models (Whisper, Parakeet) work excellently. No reason to go cloud.
  • For summarization: Quality gap is real. Smaller local models hallucinate more. Larger models need serious hardware.
  • Your choice: Meetily lets you pick - fully local, fully cloud, or mix and match per use case.

What This Means For You

If you're using Meetily:

  • Community Edition offers fully local processing
  • Pro Edition includes Hosted AI with free credits, plus cloud model integrations
  • You choose which provider (if any) to use for summarization
  • Privacy policies above help you evaluate the trade-offs

The architecture was always designed for choice. Use what works for your situation.

Frequently Asked Questions

Can Meetily still run 100% locally?
Yes. Meetily Community Edition runs 100% locally. You can use Ollama or our bundled Gemma models for summarization. No cloud connection is required after initial setup. The quality trade-offs we described apply, but for users who need complete data sovereignty, the local path remains available.

Which provider is best for privacy?
For pure privacy, Groq with Zero Data Retention enabled is strong - they don't retain data by default. Anthropic Claude and OpenAI both don't train on API data. Google Gemini's free tier does train on your data, so be careful there. For maximum control, self-hosted OpenWebUI is best.

Why not wait for local models to improve?
We're actively tracking this space. Local models improve rapidly. But today, users need working solutions. We'd rather give you options with transparency than make you wait indefinitely or use subpar quality in the meantime.

What data is sent to cloud providers?
Only the text transcript - never the audio. You can also choose to send only portions of the transcript. The raw meeting recording always stays on your device.

Do you mark up cloud API costs?
When you bring your own API key, you pay the provider directly. We don't mark up costs or take a cut. Hosted AI is included with Pro (generous free credits). We handle the infrastructure so you don't have to set anything up.

Key Takeaways

  1. Local transcription (Whisper/Parakeet) works well - we recommend it for all users
  2. Local summarization has trade-offs: smaller models hallucinate, larger models need serious hardware
  3. We now offer Hosted AI (no setup) and cloud summarization with vetted providers
  4. Anthropic Claude, Groq, and OpenAI don't train on API data
  5. Google Gemini's free tier DOES train on your data - be aware of your tier
  6. Self-hosted OpenWebUI gives cloud quality with full data control
  7. You always choose - fully local remains available in Community Edition

Try It Yourself

Meetily Community Edition is free and open source. Pro includes Hosted AI with free credits for users who just want great summaries without setup. Or bring your own API key. Or go fully local. Your call.

Whether you go fully local, use Hosted AI, or connect your own cloud provider, you're in control.


This post reflects our understanding as of February 2026. Privacy policies change. We'll update this post if providers make significant changes. If you spot anything outdated, email us at meetily@zackriya.com.

About the Author


Sujith @ Zackriya Solutions

Founder of Meetily. Building privacy-first AI tools and being honest about what works and what doesn't. Follow our journey on GitHub.
