Pluggable Transcription

Choose your transcription engine - local Whisper, Parakeet, self-hosted servers, or cloud APIs like Groq and OpenAI. One settings page, full control.

Last updated: March 10, 2026

TL;DR

Meetily can turn your meetings into text using different transcription engines. You pick which engine to use, and Meetily handles the rest. Local options (Whisper, Parakeet) keep audio on your machine. Cloud options (Groq, OpenAI) offer speed. Self-hosted servers (OpenAI-Compatible) give you full control. Everything is configured in one place: Settings > Transcript.

Pluggable Transcription

Meetily can turn your meetings into text using different transcription engines. You pick which engine to use, and Meetily handles the rest.

Think of it like choosing a printer - some are built into your computer (local), some are on the network (cloud). You pick one, configure it once, and every meeting uses it until you change your mind.

Everything is configured in one place: Settings > Transcript.


Quick Start: Which Provider Should I Use?

Not sure which one to pick? Here's a simple decision tree:

Are you on a Mac?
  → Yes → Use Whisper (it's pre-selected and marked "Recommended")
  → No, Windows → Use Parakeet (also marked "Recommended")

Want faster transcription without using your computer's resources?
  → Yes → Use Groq (free tier available, very fast)

Running your own transcription server?
  → Yes → Use OpenAI-Compatible

Need the highest possible accuracy and don't mind paying?
  → Yes → Use OpenAI

Available Providers

When you open the provider dropdown, you'll see two groups:

On-Device

These run on your computer. No internet needed after the initial model download. Your audio never leaves your machine.

| Provider | Best for | Example use case |
| --- | --- | --- |
| Whisper | macOS users | "I want accurate transcription running locally on my MacBook Pro. I don't want my meeting audio sent anywhere." |
| Parakeet | Windows users | "I have a Windows PC with an NVIDIA GPU and want fast local transcription." |
| OpenAI-Compatible | Self-hosted servers | "I run a Speaches server on my home lab / office server and want Meetily to use it." |

Recommended Badges

On macOS, Whisper shows a small amber "Recommended" badge. On Windows, Parakeet gets the badge instead.

Cloud

These send your audio to external servers for transcription. Faster than local - but requires internet and an API key.

| Provider | Best for | Example use case |
| --- | --- | --- |
| Groq | Speed-focused users | "I want the fastest possible transcription. Groq processes audio in seconds." |
| OpenAI | Accuracy-focused users | "I want OpenAI's official Whisper API for maximum accuracy." |

More cloud providers (Together AI, Fireworks AI) are coming soon.


How to Switch Providers

Step-by-step walkthrough

  1. Click the gear icon to open Settings, then select the Transcript tab
  2. Find the Transcription Provider dropdown at the top
  3. Click it - you'll see two groups: On-Device and Cloud
  4. Click a provider name to select it
  5. A green toast notification confirms: "Switched to [provider name]"
  6. New configuration fields appear below the dropdown based on your choice:
    • Whisper → model manager (download/select models)
    • Parakeet → model download prompt
    • Groq/OpenAI → API key field, model selector, test connection button
    • OpenAI-Compatible → server URL field, optional API key, model selector, test connection button

What gets saved, and when

| Action | When it saves | Example |
| --- | --- | --- |
| Selecting a provider | Immediately | You click "Groq" → saved right away |
| Selecting a model | Immediately | You pick "whisper-large-v3" → saved right away |
| Typing an API key | When you click away from the field | You paste your key, then click somewhere else → saved |
| Typing a server URL | When you click away from the field | You type http://localhost:8000, then click elsewhere → saved |

Switching back restores everything

Example: You set up Groq with your API key and whisper-large-v3 model. Then you switch to Whisper for a week. When you switch back to Groq, your API key and model choice are still there - nothing lost.
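The per-provider restore behavior can be pictured with a small sketch. This is illustrative only, not Meetily's actual storage code: each provider gets its own isolated settings bucket, and switching only changes which bucket is active.

```python
# Illustrative sketch (not Meetily's implementation): each provider
# keeps an isolated settings bucket, so switching away and back
# restores exactly what you saved.
class ProviderSettings:
    def __init__(self):
        self._store = {}    # provider name -> saved config
        self.active = None

    def switch(self, provider):
        """Activate a provider, creating an empty bucket on first use."""
        self._store.setdefault(provider, {})
        self.active = provider

    def save(self, **fields):
        """Persist fields into the active provider's bucket only."""
        self._store[self.active].update(fields)

    def config(self):
        return dict(self._store[self.active])

settings = ProviderSettings()
settings.switch("groq")
settings.save(api_key="gsk_example", model="whisper-large-v3")
settings.switch("whisper")   # use Whisper for a week...
settings.save(model="small")
settings.switch("groq")      # ...switch back: nothing lost
```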

Can't switch during a recording

The provider dropdown is grayed out while recording. If you click it, a toast message appears: "Transcription provider cannot be changed while recording is in progress." Same for the model dropdown. Finish your recording first, then switch.


Setting Up Each Provider

Whisper (Local)

The simplest option - everything runs on your machine with zero configuration.

Steps:

  1. Select Whisper from the provider dropdown
  2. The model manager appears below, showing available models:
| Model | Speed | Accuracy | Download size | Best for |
| --- | --- | --- | --- | --- |
| tiny | Fastest | Lower | ~75 MB | Quick testing, low-powered machines |
| base | Fast | Moderate | ~142 MB | Casual meetings, development |
| small | Medium | Good | ~466 MB | Daily use, most meetings |
| medium | Slower | Better | ~1.5 GB | Important meetings, professional use |
| large-v3 | Slowest | Best | ~3 GB | Maximum accuracy, presentations |
  3. Click the download button next to a model. A progress bar shows the download.
  4. Once downloaded, select the model - it's now your active transcription engine.
  5. Start recording. Text appears in real-time.

Example scenario: You download small (466 MB, takes ~2 minutes). You start recording a team standup. Words appear on screen as people speak. Everything runs on your Mac's M1 chip - your audio never touches the internet.

Privacy note at the bottom of settings confirms: "Audio is processed on your local server. No data leaves your machine."


Adding Custom Models from HuggingFace

The built-in models cover most needs, but if you want a specialized or community-tuned Whisper model, you can download one from HuggingFace.

Example: You found a Whisper model fine-tuned for medical terminology on HuggingFace and want to use it for transcribing doctor-patient consultations.

Steps:

  1. In the Whisper model manager, click Add from HuggingFace
  2. Paste a HuggingFace repository URL into the input field
    • Example: https://huggingface.co/ggerganov/whisper.cpp
    • You can also paste just the repo path: ggerganov/whisper.cpp
  3. Meetily scans the repository and shows a file picker listing all compatible model files:
    • Each file shows its name, size (e.g., "463 MB"), and a download icon
  4. Click the download icon next to the model you want
  5. A progress bar appears:
    • Shows percentage complete and MB downloaded (e.g., "234 MB / 463 MB")
    • A cancel button lets you stop the download at any time - partial files are cleaned up automatically
  6. Once complete, the model appears in your model list with:
    • An "HF" badge to distinguish it from built-in models
    • The model name, file size, and source URL
    • A green "Ready" indicator
    • Hover over the model card to reveal a delete button if you want to remove it later

If the repository is private or gated: Meetily shows an amber banner: "This model requires authentication." Enter your HuggingFace token (get one at huggingface.co/settings/tokens).

What if I pick the wrong file type? Meetily validates models automatically:

  • GGML files (.bin, .ggml) - accepted
  • GGUF files - rejected with message: "No GGML model files found. Note: GGUF models are not compatible with whisper.cpp."
  • F32 (unquantized) models - rejected (too large for real-time use, detected within the first few seconds of download)
  • PyTorch .bin files - rejected (wrong format, detected by file header)
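The validation rules above can be sketched in a few lines. This is a simplified illustration, not Meetily's implementation: it only checks the GGUF magic bytes and the file extension, whereas the real validator also reads the GGML header to catch PyTorch .bin files and unquantized F32 models.

```python
# Simplified sketch of the validation described above (not Meetily's
# actual code). GGUF files begin with the four magic bytes "GGUF";
# anything else is judged here by extension alone.
GGUF_MAGIC = b"GGUF"

def classify_model_file(name: str, header: bytes) -> str:
    """Return 'accepted' or a rejection reason for a candidate model file."""
    if header[:4] == GGUF_MAGIC:
        return "rejected: GGUF models are not compatible with whisper.cpp"
    if name.endswith((".bin", ".ggml")):
        return "accepted"
    return "rejected: not a recognized GGML model file"

print(classify_model_file("ggml-small.bin", b"lmgg\x00\x00\x00\x00"))
```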

Parakeet (Local)

  1. Select Parakeet from the provider dropdown
  2. Download the Parakeet model when prompted
  3. Start recording

Best for Windows machines with NVIDIA GPUs. Like Whisper, everything runs locally and your audio stays on your machine.

Example scenario: You have a Windows desktop with an RTX 4070. Parakeet uses NVIDIA's optimized engine for fast, accurate transcription without any cloud dependency.


OpenAI-Compatible (Self-Hosted Servers)

Use this when you run your own transcription server - on your computer, your home lab, or your office network.

Example scenario: You run Speaches on a spare Linux box with a GPU. You want Meetily on your Mac to send audio to that server for transcription.

Steps:

  1. Select OpenAI Compatible from the provider dropdown

  2. Enter your Server URL - this is the address where your server is running.

    • Examples: http://localhost:8000, http://192.168.1.50:8000, http://my-server.local:8080
    • Tip: Click the info icon (i) next to the URL field. A helper panel pops up with two columns:
    | Local servers | Cloud providers |
    | --- | --- |
    | Speaches → http://localhost:8000 | OpenAI → https://api.openai.com |
    | whisper.cpp → http://localhost:8080 | Groq → https://api.groq.com/openai |
    | Faster Whisper Server → http://localhost:8000 | Together AI → https://api.together.xyz |
    | vLLM → http://localhost:8000 | Fireworks AI → https://api.fireworks.ai |

    Each URL is clickable - click one to auto-fill the field. There's also a copy button next to each URL. Clicking a URL from this panel immediately saves it and fetches available models.

  3. Enter an API Key (optional)

    • Most local servers (Speaches, whisper.cpp) don't need one - leave it blank
    • If your server uses authentication, paste the key here
  4. Wait for models to load - after entering the URL and clicking away, Meetily fetches available models from your server

    • A spinning icon appears next to the "Model" label while loading
    • If no URL is entered yet, you'll see a disabled dropdown with: "Enter server URL and API key to load models."
  5. Select a model from the dropdown

    • Models from your server appear automatically
    • If your model isn't listed, type its name - a "Use [your-model-name]" option appears
  6. Click Test Connection to verify everything works

What the URL field border tells you:

  • Green border + checkmark = your server is reachable and the model exists
  • Red border + X icon = something's wrong (see Troubleshooting below)

Privacy note at the bottom confirms: "Audio is processed on your local server. No data leaves your machine." (when using a localhost or local network URL)
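The base URLs in the helper panel all work the same way: the app appends the OpenAI-convention path `/v1/models` when fetching the model list. A minimal sketch of that mapping (the helper function is hypothetical; the endpoint path follows the OpenAI API convention):

```python
# Hypothetical helper: turn a base server URL into the OpenAI-compatible
# models endpoint. Trailing slashes on the base URL are tolerated.
def models_endpoint(base_url: str) -> str:
    return base_url.rstrip("/") + "/v1/models"

print(models_endpoint("http://localhost:8000"))
# e.g. Groq's base "https://api.groq.com/openai" maps to
# "https://api.groq.com/openai/v1/models" the same way.
```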


Groq / OpenAI (Cloud)

Example scenario (Groq): You want the fastest possible transcription and don't mind sending audio to the cloud. You signed up at groq.com and got a free API key.

Example scenario (OpenAI): You already pay for OpenAI's API and want to use their Whisper model for maximum accuracy.

Steps:

  1. Select the provider (e.g., Groq) from the dropdown

  2. Enter your API Key

    • The field shows dots (like a password) - your key is hidden by default
    • Click the eye icon on the right to reveal the key, click again to hide it
    • After you type/paste your key and click away from the field, two things happen:
      1. The key is saved
      2. The field locks automatically - a lock icon appears, and the field becomes read-only
    • This lock prevents you from accidentally editing or deleting your key

    To edit a locked key:

    • Click the lock icon to unlock the field
    • If you click the text field itself (instead of the lock icon), the lock icon shakes in red to show you where to click

    Before you enter a key, you'll see a disabled model dropdown with: "Enter API key to load available models."

  3. Models load automatically after you save your API key

    • A spinning icon appears next to "Model" while fetching from the provider's API
    • Once loaded, the model dropdown activates
  4. Select a model from the dropdown. Models are organized into groups:

    | Group | What it means | Visual indicator |
    | --- | --- | --- |
    | Verified | Curated models tested to work well. Best choice for most users. | Blue checkmark badge next to the name |
    | Others | Additional models from the API. May work, but not tested by us. | No badge |
    | Custom | Appears when you type a name that doesn't match any listed model. | Shows as "Use [name]" at the bottom |

    Example for Groq: The "Verified" group shows whisper-large-v3 and whisper-large-v3-turbo with blue badges. Pick one.

    Example for a custom model: You type my-fine-tuned-whisper in the search box. Since it doesn't match any listed model, a "Use my-fine-tuned-whisper" option appears at the bottom. Click it.

  5. Click Test Connection to verify

Privacy Note

When using cloud providers, your audio is sent to external servers for transcription. The privacy note at the bottom of settings updates to reflect this: "Audio is sent to [provider] servers for transcription. API key stored locally only."


Connection Testing

For any provider with a server URL or API key (OpenAI-Compatible, Groq, OpenAI), always test your connection before your first recording.

What the test checks

  1. Can Meetily reach the server? It connects to the provider's /v1/models endpoint.
  2. Is your model available? It checks if your selected model exists on the server.
  3. How fast is it? It measures and reports latency.
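The model-availability half of the test can be sketched as a pure function. This is illustrative, not Meetily's code; the field names follow the OpenAI `/v1/models` response shape, which lists models under a top-level "data" array.

```python
# Sketch of the model check: given a /v1/models response, is the
# selected model on the server? (Illustrative; not Meetily's code.)
def check_model(models_response: dict, wanted: str) -> str:
    ids = [m["id"] for m in models_response.get("data", [])]
    if wanted in ids:
        return "ok"
    return f"model '{wanted}' not found in {len(ids)} available models"

sample = {"data": [{"id": "whisper-large-v3"},
                   {"id": "whisper-large-v3-turbo"}]}
print(check_model(sample, "whisper-large-v3"))
```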

Test results you might see

| Result | What it means | What to do |
| --- | --- | --- |
| Green border + "Connected (42ms)" | Everything works. You're ready to record. | Nothing - you're good! |
| "Connected successfully but model 'X' not found in 5 available models" | Server is reachable, but doesn't have your model. | Pick a different model from the dropdown. |
| Red border + "Connection failed" | Can't reach the server at all. | Check the URL, make sure the server is running. |
| Grayed-out button + "Enter API key first" | Can't test without credentials. | Enter your API key first. |

Troubleshooting checklist

If the test fails, go through this list:

  1. Check the URL - is there a typo? Missing port number? (http://localhost:8000 not http://localhost)
  2. Is the server running? - try opening the URL in your browser. You should see a response (even if it's an error page).
  3. Is the API key correct? - try the key in the provider's own dashboard/playground first.
  4. Firewall or VPN? - local servers might be blocked by firewall rules. Cloud providers might be blocked by a corporate VPN.
  5. Wrong model? - the model name must exactly match what the server offers. Use the models that auto-populate rather than typing manually.

How It Works During Recording

When you press the record button, here's what happens behind the scenes:

Local providers (Whisper, Parakeet)

Your microphone → Meetily captures audio
                → Voice Activity Detection filters out silence
                → Only speech is sent to the Whisper/Parakeet engine on your machine
                → Text appears on screen in real-time
  • No internet used - everything happens on your CPU/GPU
  • Silence is skipped - VAD (Voice Activity Detection) filters out pauses, keyboard typing, background noise. This makes transcription ~70% more efficient.
  • Real-time results - words appear as each audio chunk (~5 seconds) is processed
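The silence-skipping idea can be illustrated with a toy energy-based gate. Real VADs (Silero, WebRTC VAD, and whatever Meetily actually ships) are far more robust; this sketch just thresholds per-chunk RMS energy to show the concept.

```python
# Toy energy-based VAD: keep only audio chunks whose RMS energy
# suggests speech. Illustrative only; real VADs use learned models
# or spectral features, not a bare energy threshold.
def rms(chunk):
    return (sum(s * s for s in chunk) / len(chunk)) ** 0.5

def speech_chunks(chunks, threshold=0.02):
    """Drop near-silent chunks before they reach the transcriber."""
    return [c for c in chunks if rms(c) >= threshold]

silence = [0.001] * 160
speech = [0.2, -0.3, 0.25, -0.15] * 40
print(len(speech_chunks([silence, speech, silence])))  # only speech survives
```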

External providers (OpenAI-Compatible, Groq, OpenAI)

Your microphone → Meetily captures audio
                → Audio is split into chunks
                → Each chunk is sent as a WAV file to the provider's API
                → Provider returns text with timestamps
                → Text appears on screen
  • Prompt chaining keeps context - Meetily sends the last 50 words from the previous chunk as context for the next one. This means if someone says "...and the quarterly results show that-" at the end of one chunk and "-we exceeded our targets" at the start of the next, the transcription stays coherent.
  • Automatic retry - if a chunk fails (network blip, server error), Meetily retries with a simpler response format.
  • Your audio recording is always saved locally - even if the cloud provider has issues, you never lose the recording itself.
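The prompt-chaining step above can be sketched as a small helper: take the tail of the previous chunk's transcript and pass it as context with the next request. The 50-word window matches the behavior described above; the function itself is illustrative, not Meetily's code.

```python
# Sketch of prompt chaining: carry the last N words of the previous
# chunk's transcript as context for the next transcription request.
# (Illustrative; the 50-word window follows the description above.)
def context_prompt(previous_text: str, max_words: int = 50) -> str:
    words = previous_text.split()
    return " ".join(words[-max_words:])

prev = "word " * 60 + "and the quarterly results show that"
prompt = context_prompt(prev)
print(prompt[-40:])  # ends with the sentence fragment mid-thought
```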

Where Settings Are Stored

All settings are stored locally on your machine. Nothing is sent to our servers.

| Setting | Where | Details |
| --- | --- | --- |
| Selected provider | App database | Which provider is active (e.g., "groq") |
| Selected model | App database | Which model to use (e.g., "whisper-large-v3") |
| API key | App database + config file | Your secret key, stored per-provider |
| Server URL | Config file + browser storage | Saved per-provider, restored on switch |

Key point: Each provider's credentials are completely isolated. Your Groq API key is never mixed with your OpenAI settings. When you switch between providers, each one loads its own saved configuration.


Real-World Examples

Example 1: Daily standups on a Mac

"I run a 15-minute standup every morning. I want fast, private transcription."

Setup: Whisper → small model → Start recording. Done. Audio stays on your Mac, transcription takes ~2 seconds per chunk on an M1.

Example 2: Long client calls with cloud speed

"I have hour-long client calls and want transcription without loading my laptop."

Setup: Groq → paste API key → select whisper-large-v3 → Test Connection → Start recording. Groq processes audio in the cloud, your laptop stays cool.

Example 3: Self-hosted for privacy compliance

"Our company policy says no audio can leave our network. I run Speaches on an internal server."

Setup: OpenAI-Compatible → Server URL: http://transcription-server.internal:8000 → Models load automatically → select model → Test Connection → Start recording. Audio goes to your internal server, never to the internet.

Example 4: Specialized medical transcription

"I found a Whisper model fine-tuned for medical terminology on HuggingFace."

Setup: Whisper → Add from HuggingFace → paste repo URL → pick the GGML model file → download → select it → Start recording. The custom model handles medical terms better than the default models.


Screenshots Guide

These are the key screens worth capturing for visual reference:

| # | What to capture | Why it helps |
| --- | --- | --- |
| 1 | Provider dropdown open - showing On-Device group (Whisper with "Recommended" badge, Parakeet, OpenAI-Compatible) and Cloud group (Groq, OpenAI) | Shows the first thing users see |
| 2 | Groq fully configured - API key locked, model selected with blue verified badge, privacy note visible | Shows what a "ready" cloud setup looks like |
| 3 | Server URL info popover - the two-column panel with clickable URLs and copy buttons | Users need to know this helper exists |
| 4 | Model dropdown open - showing Verified group (blue badges), Others group, and a Custom entry | Explains the model organization |
| 5 | API key locked vs unlocked - side by side showing the lock icon and the red shake animation | Common point of confusion |
| 6 | Empty state before API key - disabled model dropdown showing "Enter API key to load available models" | Shows what new users see first |
| 7 | Connection test success - green URL border, checkmark, "Connected (42ms)" | Shows what success looks like |
| 8 | Connection test failure - red border, X icon, error message | Shows what to look for when debugging |
| 9 | HuggingFace file picker - list of model files with sizes and download buttons | The HF flow is multi-step |
| 10 | HuggingFace download in progress - progress bar, percentage, cancel button | Users need to know they can cancel |
| 11 | Privacy note variants - "No data leaves your machine" vs "Audio is sent to Groq servers" | Builds trust |
| 12 | Recording-locked state - grayed out dropdown with toast message | Prevents confusion during recording |

Frequently Asked Questions

Which provider should I start with?

If you're on a Mac, start with Whisper (it's the default and marked 'Recommended'). On Windows, start with Parakeet. If you want speed without using your computer's resources, try Groq (free tier available). See the Quick Start section above for a full decision guide.

Can I use multiple providers at the same time?

No. One provider is active at a time. All recordings use the selected provider until you change it.

Do local providers need an internet connection?

Only to download the model the first time. After that, everything runs fully offline.

Will I lose my settings when I switch providers?

No. Each provider's settings (API key, server URL, model) are saved independently. Switch away and back - everything is restored exactly as you left it.

What does the "Recommended" badge mean?

It indicates the best provider for your operating system. Whisper is recommended on macOS (uses Apple's Metal GPU), Parakeet is recommended on Windows (uses NVIDIA's optimized engine).

What's the difference between OpenAI and OpenAI-Compatible?

OpenAI is OpenAI's official cloud API at api.openai.com. Requires an API key, sends audio to OpenAI's servers. OpenAI-Compatible is any server that speaks the same protocol - could be a server on your own computer (free, no API key, audio stays local), a server on your office network, or a third-party service. Think of it this way: 'OpenAI' is a specific restaurant. 'OpenAI-Compatible' is any restaurant that serves the same menu.

Can I use a model that isn't in the dropdown?

Yes. Type the model name in the search box. A 'Use [your-model-name]' option appears at the bottom - click it. This is useful for custom fine-tuned models or new models your server supports.

Are my API keys stored securely?

Yes. API keys are stored locally on your machine in the app's database. They are never sent to Meetily's servers, only sent to the specific provider you configured, hidden (masked) by default in the UI, and locked after entry to prevent accidental edits.

Why can't I switch providers while recording?

Switching engines mid-recording would cause gaps or inconsistencies in the transcript. Finish your current recording first, then switch.

What happens if a cloud provider goes down mid-recording?

Meetily retries failed chunks with a simpler response format. If the provider stays unreachable, some text chunks may be lost - but your audio recording is always saved locally regardless. You can re-transcribe the audio later.

What does "model not found" mean?

The server is reachable but doesn't have the model you selected. Open the model dropdown and pick one of the models that auto-populated from the server. Those are the ones actually available.

How do I edit a locked API key?

Click the lock icon (not the text field). The lock icon is on the right side of the input. Clicking the text field itself just makes the lock shake in red - that's the app telling you to click the lock instead.

Ready to get started?

Download Meetily and start transcribing your meetings locally with full privacy.

Have questions? Join our GitHub community.