
TL;DR
Meetily was designed from the start with a pluggable model interface - local or cloud, your choice. After users reported that smaller local models hallucinate and miss action items, we noticed power users preferred cloud models for summarization while keeping transcription local. Here's what we learned about the trade-offs and how we evaluated each provider's privacy policy.
When we started building Meetily, our north star was privacy. No cloud uploads. No meeting bots. Everything runs on your machine.
We assumed the AI models had caught up. Whisper for transcription. Local LLMs for summarization. Simple.
We were partially right about transcription. But summarization? That's where things got complicated.
The Promise We Made
Meetily was built on a clear principle: 100% local processing. Your meetings contain sensitive information - business strategies, client details, personnel discussions. We believed (and still believe) that you shouldn't have to trust a third party with this data.
So we shipped the Community Edition with local-only AI:
- Transcription: Whisper.cpp running entirely on your device
- Summarization: Local LLMs via Ollama or our bundled Gemma models
It worked. But "works" and "works well" are different things.
What We Heard From Users
Within weeks of our release, the feedback started coming in. The transcription was solid - Whisper is genuinely impressive. But the summaries? Users reported:
- Hallucinations - The AI would confidently state things that were never said
- Missing action items - Key tasks assigned in the meeting didn't appear in the summary
- Missing important points - Major decisions glossed over or omitted entirely
- Context issues - The AI would misunderstand who said what, or conflate different topics
For a product that promises to capture your meetings accurately, this was a problem.
Why Local Models Struggle
We tested several local models:
| Model | Parameters | Performance | Issues |
|---|---|---|---|
| Gemma 1B | 1 billion | Fast, low memory | Frequent hallucinations, poor context retention |
| Gemma 4B | 4 billion | Reasonable speed | Still misses key action items |
| Mistral 7B | 7 billion | Good quality when it works | Slow; sometimes fails on a 24GB Mac |
The issue isn't that local models are bad. It's that meeting transcripts present a specific challenge:
- Long context - A 1-hour meeting might generate 10,000+ words of transcript
- Implicit context - "Let's do what we discussed last time" requires understanding what wasn't said
- Speaker attribution - Who said what matters enormously for action items
- Domain knowledge - Technical jargon, company-specific terms, acronyms
Smaller models (under 7B parameters) simply don't have the capacity to handle all of this reliably. And larger models that can? They need serious hardware - we're talking 32GB+ RAM, ideally with GPU acceleration.
Most users don't have that. And even those who do often found the processing time unacceptable.
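One common workaround for the long-context problem is map-reduce summarization: split the transcript into overlapping chunks that fit a small model's context window, summarize each chunk, then summarize the summaries. The sketch below is illustrative of that general technique, not Meetily's actual pipeline; the `summarize` callable stands in for any model call, local or cloud.

```python
# Sketch of map-reduce summarization for long transcripts.
# `summarize` is a stand-in for any model call (an assumption,
# not Meetily's real interface).

from typing import Callable, List

def chunk_transcript(words: List[str], chunk_size: int = 2000,
                     overlap: int = 200) -> List[str]:
    """Split a transcript into overlapping word chunks so each piece
    fits in a small model's context window."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

def map_reduce_summary(transcript: str,
                       summarize: Callable[[str], str],
                       chunk_size: int = 2000) -> str:
    """Summarize each chunk independently (map), then summarize the
    concatenated chunk summaries (reduce)."""
    words = transcript.split()
    if len(words) <= chunk_size:
        return summarize(transcript)
    partials = [summarize(c) for c in chunk_transcript(words, chunk_size)]
    return summarize("\n".join(partials))
```

The overlap between chunks helps preserve context that straddles a chunk boundary, though cross-chunk references ("as Sarah said earlier") remain a weak spot for this approach.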
Our Transcription Story Is Different
For transcription, local models work excellently. We offer two engines:
- Whisper (via whisper-rs): High accuracy across 99+ languages. The large-v3 model is the gold standard.
- Parakeet: Faster with excellent accuracy for English and French specifically.
Users can choose between them in the app. For transcription, local-first works. Summarization is where the gap appears.
How Meetily's Pluggable Architecture Helps
From day one, Meetily was designed with a pluggable model interface. Users can connect:
- Local models: Ollama, bundled Gemma, any OpenAI-compatible local endpoint
- Cloud models: Claude, OpenAI, Groq, or any OpenAI-compatible API
- Hosted AI: Our own hosted option. No API keys, no setup. Just works. Pro users get generous free credits, and trial users get 10 free credits to try it out.
This wasn't a pivot - it was always the architecture. What we learned from user feedback is that power users who could run both options typically preferred cloud models for summarization quality, while keeping transcription fully local.
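The reason an OpenAI-compatible interface makes this pluggable is that local and cloud backends can share one client code path, differing only in endpoint and model name. Here's a minimal sketch of that idea; the model names are illustrative assumptions, not Meetily's internal configuration (Ollama's OpenAI-compatible endpoint does default to port 11434).

```python
# Hypothetical backend selection for an OpenAI-compatible client.
# URLs follow each provider's documented OpenAI-compatible endpoint;
# model names are illustrative placeholders.

from dataclasses import dataclass

@dataclass
class Backend:
    base_url: str   # any OpenAI-compatible /v1 endpoint
    model: str
    api_key: str

def select_backend(provider: str, api_key: str = "not-needed") -> Backend:
    """Return connection details for a summarization backend.
    Local endpoints need no real API key."""
    backends = {
        "ollama": Backend("http://localhost:11434/v1", "gemma3:4b", "not-needed"),
        "openai": Backend("https://api.openai.com/v1", "gpt-4o", api_key),
        "groq":   Backend("https://api.groq.com/openai/v1", "llama-3.1-70b-versatile", api_key),
    }
    if provider not in backends:
        raise ValueError(f"unknown provider: {provider}")
    return backends[provider]
```

With this shape, swapping local for cloud is a configuration change, not a code change: the same chat-completions request works against any of the three.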
We also heard from users who just wanted something that worked out of the box without setting up API keys or running local models. That's why we added Hosted AI. You don't need to configure anything. Start a meeting, get a summary. If you don't like it, switch to local or BYOK anytime.
Some users genuinely cannot use cloud services. Healthcare providers with PHI. Legal teams with privileged communications. Security researchers. For them, the fully local path works well.
For users whose meetings don't contain regulated data, a cloud model with a clear privacy policy is often the right trade-off for better accuracy. The point is: you get to choose.
The Providers We Support (And Their Privacy Policies)
When we decided to add cloud options, we didn't just pick the most popular providers. We read every privacy policy, data processing agreement, and terms of service. Here's what we found:
Anthropic Claude API
| Aspect | Policy |
|---|---|
| Data retention | 7-30 days for abuse monitoring |
| Training on your data | No. API data is NOT used for training |
| Zero Data Retention | Available for enterprise customers |
| Our take | Strong privacy stance for a commercial API |
Claude consistently produces the highest quality meeting summaries in our testing. Anthropic has been clear: API inputs and outputs are not used for model training. In September 2025, they reduced retention from 30 days to 7 days.
For users who can accept cloud processing, Claude is our recommended option for accuracy.
Source: Anthropic Privacy Center
Groq API
| Aspect | Policy |
|---|---|
| Data retention | NOT retained by default |
| Zero Data Retention | Can enable ZDR in Data Controls settings |
| Training on your data | No explicit statement, though the policy implies no training |
| Our take | Privacy-friendly with explicit ZDR option |
Groq offers a variety of models (Llama, Mixtral, Gemma, and others) at extremely fast inference speeds via their custom LPU hardware. Their default is to not retain customer data, and you can explicitly enable Zero Data Retention.
For users who want cloud speed without prolonged data storage, Groq is a solid choice.
Source: Groq Privacy Policy, GroqCloud Data Documentation
OpenAI API
| Aspect | Policy |
|---|---|
| Data retention | 30 days for abuse monitoring |
| Training on your data | No - API/business data NOT used unless you opt in |
| Zero Data Retention | Available for qualifying businesses |
| Compliance | SOC 2 Type 2 certified, HIPAA BAA available |
| Our take | Clear policies, strong compliance certifications |
OpenAI's API has a well-documented privacy stance: they do not train on API data unless you explicitly opt in. Zero Data Retention is available for enterprise customers.
Source: OpenAI Enterprise Privacy, OpenAI Business Data
Google Gemini API
| Aspect | Policy |
|---|---|
| Data retention | Free tier: 55 days. Paid tier: 30 days |
| Training on your data | Free tier: YES. Paid/Enterprise: NO |
| Our take | Be careful with tier - free tier trains on your data |
Gemini is where you need to read the fine print. Free tier users have their data retained for 55 days AND used for model training. Paid and enterprise customers get shorter 30-day retention and are excluded from training.
If you use Gemini through Meetily, understand which tier you're on.
Source: Gemini API Terms of Service
OpenWebUI (Self-Hosted)
| Aspect | Policy |
|---|---|
| Data retention | You control it - runs on your infrastructure |
| Training on your data | No - chat history not used for training |
| Our take | Best privacy option if you can self-host |
OpenWebUI is a self-hosted interface for running LLMs. When configured properly, it gives you the best of both worlds: better models than you can run locally, with complete data control.
For privacy-maximizing users who have the technical capability, this is the gold standard.
Source: OpenWebUI
Local Ollama Models
| Aspect | Policy |
|---|---|
| Data retention | N/A - runs entirely locally |
| Training on your data | No - never leaves your device |
| Our take | Maximum privacy, variable quality |
For users who cannot accept any cloud processing, Ollama remains available. We bundle Gemma models for users who don't want to install Ollama separately.
Quality varies by model and hardware. If you have a powerful machine (32GB+ RAM, GPU), Mistral 7B or larger models can produce good results. On modest hardware, expect trade-offs.
Our Current Recommendation
Based on our testing and privacy research:
| Scenario | Recommended Provider |
|---|---|
| Regulated data (HIPAA, legal) | Local Ollama or OpenWebUI self-hosted |
| Maximum privacy, non-regulated | Groq with ZDR enabled |
| Best quality, privacy-conscious | Anthropic Claude API |
| Just want it to work | Hosted AI (free credits included with Pro) |
| Enterprise with compliance needs | OpenAI with BAA or Claude Enterprise |
| Budget-conscious | Local Gemma (bundled) or Groq free tier |
What We're Still Figuring Out
This isn't a "we solved it" post. We're still exploring:
- Fine-tuning local models - Can we train a smaller model specifically for meeting summarization?
- Hybrid approaches - Local processing for sensitive sections, cloud for general summary?
- Better prompting - Are there prompt engineering techniques that improve local model output?
- Newer models - The local AI landscape evolves rapidly. Today's limitation might be solved in 6 months.
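On the prompting question, one technique worth sketching is forcing the model into a fixed, extractive structure with an explicit "say so if absent" rule, rather than asking for a free-form summary. This is an illustrative example of the idea, not Meetily's shipped prompt.

```python
# Illustrative structured prompt for meeting summarization.
# The sections and rules are assumptions for the sketch, not
# Meetily's production prompt.

def build_summary_prompt(transcript: str) -> str:
    """Build a prompt that asks for grounded, sectioned output and an
    explicit 'None mentioned' rule to discourage hallucination."""
    return (
        "You are summarizing a meeting transcript.\n"
        "Rules:\n"
        "- Only state facts that appear in the transcript.\n"
        "- If a section has no content, write 'None mentioned'.\n"
        "- Attribute each action item to the speaker who accepted it.\n\n"
        "Output sections:\n"
        "1. Key decisions\n"
        "2. Action items (owner, task, deadline if stated)\n"
        "3. Open questions\n\n"
        f"Transcript:\n{transcript}"
    )
```

In our experience, constraints like "None mentioned" give smaller models an honest exit instead of inventing content to fill a section, though they don't eliminate hallucination entirely.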
We committed to building a privacy-first meeting assistant. That commitment doesn't mean we pretend local models are perfect when they're not. It means we're honest about trade-offs, transparent about provider policies, and committed to improving.
The Transcription Side: A Brighter Story
While summarization has been challenging, transcription is working well locally. We offer two engines:
Whisper (via whisper-rs)
- Models: tiny, base, small, medium, large-v3
- Languages: 99+ supported
- Quality: High accuracy with large-v3
- Hardware: Scales with model size; large-v3 needs ~3GB VRAM
Parakeet
- Optimized for: English and French
- Quality: Excellent for supported languages
- Speed: Faster than equivalent Whisper models
- Best for: Users who primarily have English/French meetings
Both run entirely locally. Both work offline after initial model download. For transcription, the local-first approach works.
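Since Whisper quality scales with model size and memory, a simple heuristic can pick the largest model the hardware supports. The thresholds below are illustrative guesses anchored only to the ~3GB figure for large-v3 mentioned above; this is not Meetily's actual selection logic.

```python
# Illustrative heuristic for picking a Whisper model size from
# available memory. Only the ~3GB large-v3 figure comes from the
# post; the other thresholds are assumptions.

def pick_whisper_model(free_vram_gb: float) -> str:
    """Return the largest Whisper model that plausibly fits."""
    if free_vram_gb >= 3.0:
        return "large-v3"   # gold standard, ~3GB VRAM
    if free_vram_gb >= 1.5:
        return "medium"
    if free_vram_gb >= 0.8:
        return "small"
    return "base"           # safe fallback on modest hardware
```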
What About Accuracy Claims?
You'll notice we don't claim specific accuracy percentages like "95% accurate." That's intentional. Accuracy varies dramatically by:
- Audio quality
- Number of speakers
- Accents and dialects
- Background noise
- Technical jargon
We say "high accuracy" because that's honest. Anyone claiming a universal percentage is oversimplifying.
The Trade-offs Are Real
"Local-first" doesn't automatically mean "best." Sometimes it does. Sometimes it means trading quality for privacy. Here's how we think about it:
- For transcription: Local models (Whisper, Parakeet) work excellently. No reason to go cloud.
- For summarization: Quality gap is real. Smaller local models hallucinate more. Larger models need serious hardware.
- Your choice: Meetily lets you pick - fully local, fully cloud, or mix and match per use case.
What This Means For You
If you're using Meetily:
- Community Edition offers fully local processing
- Pro Edition includes Hosted AI with free credits, plus cloud model integrations
- You choose which provider (if any) to use for summarization
- Privacy policies above help you evaluate the trade-offs
The architecture was always designed for choice. Use what works for your situation.
Key Takeaways
1. Local transcription (Whisper/Parakeet) works well - we recommend it for all users
2. Local summarization has trade-offs: smaller models hallucinate, larger models need serious hardware
3. We now offer Hosted AI (no setup) and cloud summarization with vetted providers
4. Anthropic Claude, Groq, and OpenAI don't train on API data
5. Google Gemini free tier DOES train on your data - be aware of your tier
6. Self-hosted OpenWebUI gives cloud quality with full data control
7. You always choose - fully local remains available in Community Edition
Try It Yourself
Meetily Community Edition is free and open source. Pro includes Hosted AI with free credits for users who just want great summaries without setup. Or bring your own API key. Or go fully local. Whichever path you pick - fully local, Hosted AI, or your own cloud provider - you're in control.
Download Meetily
Privacy-first meeting transcription. Local, Hosted AI, or bring your own API key.
This post reflects our understanding as of February 2026. Privacy policies change. We'll update this post if providers make significant changes. If you spot anything outdated, email us at meetily@zackriya.com.
Ready to try Meetily?
Join 76,000+ users who use Meetily for private meeting transcription. No bots, privacy first. Community Edition free.
Star on GitHub (9.9k+) · Open source & self-hostable


