Pluggable Transcription
Choose your transcription engine - local Whisper, Parakeet, self-hosted servers, or cloud APIs like Groq and OpenAI. One settings page, full control.
Last updated: March 10, 2026
TL;DR
Meetily can turn your meetings into text using different transcription engines. You pick which engine to use, and Meetily handles the rest. Local options (Whisper, Parakeet) keep audio on your machine. Cloud options (Groq, OpenAI) offer speed. Self-hosted servers (OpenAI-Compatible) give you full control. Everything is configured in one place: Settings > Transcript.
Pluggable Transcription
Meetily can turn your meetings into text using different transcription engines. You pick which engine to use, and Meetily handles the rest.
Think of it like choosing a printer - some are built into your computer (local), some are on the network (cloud or self-hosted). You pick one, configure it once, and every meeting uses it until you change your mind.
Everything is configured in one place: Settings > Transcript.
Quick Start: Which Provider Should I Use?
Not sure which one to pick? Here's a simple decision tree:
Are you on a Mac?
→ Yes → Use Whisper (it's pre-selected and marked "Recommended")
→ No, Windows → Use Parakeet (also marked "Recommended")
Want faster transcription without using your computer's resources?
→ Yes → Use Groq (free tier available, very fast)
Running your own transcription server?
→ Yes → Use OpenAI-Compatible
Need the highest possible accuracy and don't mind paying?
→ Yes → Use OpenAI
Available Providers
When you open the provider dropdown, you'll see two groups:
On-Device
These run on your computer. No internet needed after the initial model download. Your audio never leaves your machine.
| Provider | Best for | Example use case |
|---|---|---|
| Whisper | macOS users | "I want accurate transcription running locally on my MacBook Pro. I don't want my meeting audio sent anywhere." |
| Parakeet | Windows users | "I have a Windows PC with an NVIDIA GPU and want fast local transcription." |
| OpenAI-Compatible | Self-hosted servers | "I run a Speaches server on my home lab / office server and want Meetily to use it." |
Recommended Badges
On macOS, Whisper shows a small amber "Recommended" badge. On Windows, Parakeet gets the badge instead.
Cloud
These send your audio to external servers for transcription. Faster than local - but requires internet and an API key.
| Provider | Best for | Example use case |
|---|---|---|
| Groq | Speed-focused users | "I want the fastest possible transcription. Groq processes audio in seconds." |
| OpenAI | Accuracy-focused users | "I want OpenAI's official Whisper API for maximum accuracy." |
More cloud providers (Together AI, Fireworks AI) are coming soon.
How to Switch Providers
Step-by-step walkthrough
- Click the gear icon to open Settings, then select the Transcript tab
- Find the Transcription Provider dropdown at the top
- Click it - you'll see two groups: On-Device and Cloud
- Click a provider name to select it
- A green toast notification confirms: "Switched to [provider name]"
- New configuration fields appear below the dropdown based on your choice:
- Whisper → model manager (download/select models)
- Parakeet → model download prompt
- Groq/OpenAI → API key field, model selector, test connection button
- OpenAI-Compatible → server URL field, optional API key, model selector, test connection button
What gets saved, and when
| Action | When it saves | Example |
|---|---|---|
| Selecting a provider | Immediately | You click "Groq" → saved right away |
| Selecting a model | Immediately | You pick "whisper-large-v3" → saved right away |
| Typing an API key | When you click away from the field | You paste your key, then click somewhere else → saved |
| Typing a server URL | When you click away from the field | You type http://localhost:8000, then click elsewhere → saved |
Switching back restores everything
Example: You set up Groq with your API key and whisper-large-v3 model. Then you switch to Whisper for a week. When you switch back to Groq, your API key and model choice are still there - nothing lost.
Can't switch during a recording
The provider dropdown is grayed out while recording. If you click it, a toast message appears: "Transcription provider cannot be changed while recording is in progress." Same for the model dropdown. Finish your recording first, then switch.
Setting Up Each Provider
Whisper (Local)
The simplest option - everything runs on your machine with zero configuration.
Steps:
- Select Whisper from the provider dropdown
- The model manager appears below, showing available models:
| Model | Speed | Accuracy | Download size | Best for |
|---|---|---|---|---|
| tiny | Fastest | Lower | ~75 MB | Quick testing, low-powered machines |
| base | Fast | Moderate | ~142 MB | Casual meetings, development |
| small | Medium | Good | ~466 MB | Daily use, most meetings |
| medium | Slower | Better | ~1.5 GB | Important meetings, professional use |
| large-v3 | Slowest | Best | ~3 GB | Maximum accuracy, presentations |
- Click the download button next to a model. A progress bar shows the download.
- Once downloaded, select the model - it's now your active transcription engine.
- Start recording. Text appears in real-time.
Example scenario: You download small (466 MB, takes ~2 minutes). You start recording a team standup. Words appear on screen as people speak. Everything runs on your Mac's M1 chip - your audio never touches the internet.
Privacy note at the bottom of settings confirms: "Audio is processed on your local server. No data leaves your machine."
Adding Custom Models from HuggingFace
The built-in models cover most needs, but if you want a specialized or community-tuned Whisper model, you can download one from HuggingFace.
Example: You found a Whisper model fine-tuned for medical terminology on HuggingFace and want to use it for transcribing doctor-patient consultations.
Steps:
- In the Whisper model manager, click Add from HuggingFace
- Paste a HuggingFace repository URL into the input field
- Example: https://huggingface.co/ggerganov/whisper.cpp
- You can also paste just the repo path: ggerganov/whisper.cpp
- Meetily scans the repository and shows a file picker listing all compatible model files:
- Each file shows its name, size (e.g., "463 MB"), and a download icon
- Click the download icon next to the model you want
- A progress bar appears:
- Shows percentage complete and MB downloaded (e.g., "234 MB / 463 MB")
- A cancel button lets you stop the download at any time - partial files are cleaned up automatically
- Once complete, the model appears in your model list with:
- An "HF" badge to distinguish it from built-in models
- The model name, file size, and source URL
- A green "Ready" indicator
- Hover over the model card to reveal a delete button if you want to remove it later
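Behind the scenes, a download flow like this streams the file in chunks, reports progress after each write, and deletes the partial file on cancel. Here is a minimal Python sketch of that behavior (the function and its signature are illustrative, not Meetily's actual code):

```python
import os
from typing import BinaryIO, Callable

def download_with_progress(
    src: BinaryIO,
    dest_path: str,
    total_bytes: int,
    on_progress: Callable[[int, int], None],
    is_cancelled: Callable[[], bool],
    chunk_size: int = 64 * 1024,
) -> bool:
    """Copy `src` to `dest_path` in chunks, reporting progress.

    Returns True on completion. On cancellation, the partial file is
    deleted and False is returned (mirroring the cleanup described above).
    """
    written = 0
    try:
        with open(dest_path, "wb") as out:
            while True:
                if is_cancelled():
                    raise InterruptedError
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
                written += len(chunk)
                on_progress(written, total_bytes)  # e.g. "234 MB / 463 MB"
        return True
    except InterruptedError:
        os.remove(dest_path)  # clean up the partial file
        return False
```

The key design point is that cancellation is checked between chunks, so a cancel takes effect quickly without leaving a half-written model file behind.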
If the repository is private or gated: Meetily shows an amber banner: "This model requires authentication." Enter your HuggingFace token (get one at huggingface.co/settings/tokens).
What if I pick the wrong file type? Meetily validates models automatically:
- GGML files (.bin, .ggml) - accepted
- GGUF files - rejected with message: "No GGML model files found. Note: GGUF models are not compatible with whisper.cpp."
- F32 (unquantized) models - rejected (too large for real-time use, detected within the first few seconds of download)
- PyTorch .bin files - rejected (wrong format, detected by file header)
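File-header validation like this typically works by inspecting the first few bytes of the file (its "magic number"). The Python sketch below shows the idea; the exact magic values are assumptions based on common whisper.cpp/GGUF/PyTorch file conventions, not taken from Meetily's source:

```python
import struct

# Assumed magic numbers: whisper.cpp GGML files begin with the
# little-endian uint32 0x67676d6c; GGUF files begin with ASCII "GGUF";
# modern PyTorch checkpoints are ZIP archives ("PK\x03\x04").
GGML_MAGIC = struct.pack("<I", 0x67676D6C)
GGUF_MAGIC = b"GGUF"
ZIP_MAGIC = b"PK\x03\x04"

def classify_model_header(first_bytes: bytes) -> str:
    """Classify a model file from its first four bytes."""
    head = first_bytes[:4]
    if head == GGML_MAGIC:
        return "ggml"     # accepted
    if head == GGUF_MAGIC:
        return "gguf"     # rejected: not compatible with whisper.cpp
    if head == ZIP_MAGIC:
        return "pytorch"  # rejected: wrong container format
    return "unknown"
```

Because the header sits at the very start of the file, a download can be rejected within the first few bytes rather than after gigabytes have been transferred.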
Parakeet (Local)
- Select Parakeet from the provider dropdown
- Download the Parakeet model when prompted
- Start recording
Best for Windows machines with NVIDIA GPUs. Like Whisper, everything runs locally and your audio stays on your machine.
Example scenario: You have a Windows desktop with an RTX 4070. Parakeet uses NVIDIA's optimized engine for fast, accurate transcription without any cloud dependency.
OpenAI-Compatible (Self-Hosted Servers)
Use this when you run your own transcription server - on your computer, your home lab, or your office network.
Example scenario: You run Speaches on a spare Linux box with a GPU. You want Meetily on your Mac to send audio to that server for transcription.
Steps:
- Select OpenAI Compatible from the provider dropdown
- Enter your Server URL - this is the address where your server is running.
  - Examples: http://localhost:8000, http://192.168.1.50:8000, http://my-server.local:8080
  - Tip: Click the info icon (i) next to the URL field. A helper panel pops up with two columns:

| Local servers | Cloud providers |
|---|---|
| Speaches → http://localhost:8000 | OpenAI → https://api.openai.com |
| whisper.cpp → http://localhost:8080 | Groq → https://api.groq.com/openai |
| Faster Whisper Server → http://localhost:8000 | Together AI → https://api.together.xyz |
| vLLM → http://localhost:8000 | Fireworks AI → https://api.fireworks.ai |

  Each URL is clickable - click one to auto-fill the field. There's also a copy button next to each URL. Clicking a URL from this panel immediately saves it and fetches available models.
- Enter an API Key (optional)
  - Most local servers (Speaches, whisper.cpp) don't need one - leave it blank
  - If your server uses authentication, paste the key here
- Wait for models to load - after entering the URL and clicking away, Meetily fetches available models from your server
  - A spinning icon appears next to the "Model" label while loading
  - If no URL is entered yet, you'll see a disabled dropdown with: "Enter server URL and API key to load models."
- Select a model from the dropdown
  - Models from your server appear automatically
  - If your model isn't listed, type its name - a "Use [your-model-name]" option appears
- Click Test Connection to verify everything works
What the URL field border tells you:
- Green border + checkmark = your server is reachable and the model exists
- Red border + X icon = something's wrong (see Troubleshooting below)
Privacy note at the bottom confirms: "Audio is processed on your local server. No data leaves your machine." (when using a localhost or local network URL)
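For the curious: fetching the model list from an OpenAI-compatible server is a single GET request to its /v1/models endpoint, which returns models wrapped in an OpenAI-style `{"data": [{"id": ...}]}` envelope. A minimal Python sketch using only the standard library (function names are illustrative, not Meetily's code):

```python
import json
import urllib.request

def parse_model_ids(body: str) -> list[str]:
    """OpenAI-style responses wrap models as {"data": [{"id": ...}, ...]}."""
    return [m["id"] for m in json.loads(body).get("data", [])]

def list_models(base_url: str, api_key: str = "") -> list[str]:
    """Fetch model IDs from an OpenAI-compatible /v1/models endpoint."""
    req = urllib.request.Request(base_url.rstrip("/") + "/v1/models")
    if api_key:  # most local servers (Speaches, whisper.cpp) need no key
        req.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return parse_model_ids(resp.read().decode("utf-8"))
```

This is also why the model dropdown stays disabled until a URL is entered: without a base URL there is nothing to query.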
Groq / OpenAI (Cloud)
Example scenario (Groq): You want the fastest possible transcription and don't mind sending audio to the cloud. You signed up at groq.com and got a free API key.
Example scenario (OpenAI): You already pay for OpenAI's API and want to use their Whisper model for maximum accuracy.
Steps:
- Select the provider (e.g., Groq) from the dropdown
- Enter your API Key
  - The field shows dots (like a password) - your key is hidden by default
  - Click the eye icon on the right to reveal the key, click again to hide it
  - After you type/paste your key and click away from the field, two things happen:
    - The key is saved
    - The field locks automatically - a lock icon appears, and the field becomes read-only
  - This lock prevents you from accidentally editing or deleting your key

  To edit a locked key:
  - Click the lock icon to unlock the field
  - If you click the text field itself (instead of the lock icon), the lock icon shakes in red to show you where to click

  Before you enter a key, you'll see a disabled model dropdown with: "Enter API key to load available models."
- Models load automatically after you save your API key
  - A spinning icon appears next to "Model" while fetching from the provider's API
  - Once loaded, the model dropdown activates
- Select a model from the dropdown. Models are organized into groups:

| Group | What it means | Visual indicator |
|---|---|---|
| Verified | Curated models tested to work well. Best choice for most users. | Blue checkmark badge next to the name |
| Others | Additional models from the API. May work, but not tested by us. | No badge |
| Custom | Appears when you type a name that doesn't match any listed model. | Shows as "Use [name]" at the bottom |

  Example for Groq: The "Verified" group shows whisper-large-v3 and whisper-large-v3-turbo with blue badges. Pick one.

  Example for a custom model: You type my-fine-tuned-whisper in the search box. Since it doesn't match any listed model, a "Use my-fine-tuned-whisper" option appears at the bottom. Click it.
- Click Test Connection to verify
Privacy Note
When using cloud providers, your audio is sent to external servers for transcription. The privacy note at the bottom of settings updates to reflect this: "Audio is sent to [provider] servers for transcription. API key stored locally only."
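For reference, a cloud transcription request is a multipart/form-data POST of a WAV chunk to the provider's /v1/audio/transcriptions endpoint, following the OpenAI audio API convention (which Groq mirrors). A Python sketch using only the standard library - helper names are illustrative, not Meetily's actual code:

```python
import json
import urllib.request
import uuid

def build_multipart(wav_bytes: bytes, model: str, prompt: str = "") -> tuple[bytes, str]:
    """Build a multipart/form-data body for a transcription request."""
    boundary = uuid.uuid4().hex
    parts = []

    def field(name: str, value: str) -> None:
        parts.append(
            f"--{boundary}\r\nContent-Disposition: form-data; "
            f'name="{name}"\r\n\r\n{value}\r\n'.encode()
        )

    field("model", model)
    if prompt:  # e.g. the last ~50 words of the previous chunk, for context
        field("prompt", prompt)
    parts.append(
        f"--{boundary}\r\nContent-Disposition: form-data; "
        f'name="file"; filename="chunk.wav"\r\n'
        f"Content-Type: audio/wav\r\n\r\n".encode() + wav_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"

def transcribe_chunk(base_url: str, api_key: str, wav_bytes: bytes, model: str) -> str:
    """POST one WAV chunk and return the transcribed text."""
    body, content_type = build_multipart(wav_bytes, model)
    req = urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())["text"]
```

In practice a client library (or `requests`) would handle the multipart encoding, but spelling it out shows exactly what leaves your machine: the audio chunk, the model name, and an optional context prompt.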
Connection Testing
For any provider with a server URL or API key (OpenAI-Compatible, Groq, OpenAI), always test your connection before your first recording.
What the test checks
- Can Meetily reach the server? It connects to the provider's /v1/models endpoint.
- Is your model available? It checks if your selected model exists on the server.
- How fast is it? It measures and reports latency.
Test results you might see
| Result | What it means | What to do |
|---|---|---|
| Green border + "Connected (42ms)" | Everything works. You're ready to record. | Nothing - you're good! |
| "Connected successfully but model 'X' not found in 5 available models" | Server is reachable, but doesn't have your model. | Pick a different model from the dropdown. |
| Red border + "Connection failed" | Can't reach the server at all. | Check the URL, make sure the server is running. |
| Grayed-out button + "Enter API key first" | Can't test without credentials. | Enter your API key first. |
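The mapping from test outcome to message can be sketched as a small helper (illustrative Python that mirrors the messages in the table above; not Meetily's actual code):

```python
from typing import Optional

def classify_test_result(models: Optional[list[str]], selected: str,
                         latency_ms: int) -> str:
    """Map a connection-test outcome to a settings-page message.

    `models` is None when the server was unreachable; otherwise it is
    the list returned by the provider's /v1/models endpoint.
    """
    if models is None:
        return "Connection failed"
    if selected not in models:
        return (f"Connected successfully but model '{selected}' "
                f"not found in {len(models)} available models")
    return f"Connected ({latency_ms}ms)"
```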
Troubleshooting checklist
If the test fails, go through this list:
- Check the URL - is there a typo? Missing port number? (http://localhost:8000, not http://localhost)
- Is the server running? - try opening the URL in your browser. You should see a response (even if it's an error page).
- Is the API key correct? - try the key in the provider's own dashboard/playground first.
- Firewall or VPN? - local servers might be blocked by firewall rules. Cloud providers might be blocked by a corporate VPN.
- Wrong model? - the model name must exactly match what the server offers. Use the models that auto-populate rather than typing manually.
How It Works During Recording
When you press the record button, here's what happens behind the scenes:
Local providers (Whisper, Parakeet)
Your microphone → Meetily captures audio
→ Voice Activity Detection filters out silence
→ Only speech is sent to the Whisper/Parakeet engine on your machine
→ Text appears on screen in real-time
- No internet used - everything happens on your CPU/GPU
- Silence is skipped - VAD (Voice Activity Detection) filters out pauses, keyboard typing, background noise. This makes transcription ~70% more efficient.
- Real-time results - words appear as each audio chunk (~5 seconds) is processed
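Meetily's actual VAD model isn't described here, but its filtering role can be illustrated with a crude energy-based stand-in (Python sketch; real VADs such as Silero or WebRTC VAD use trained models, yet play the same role of dropping silent frames before they reach the engine):

```python
import math

def is_speech(samples: list[float], threshold: float = 0.01) -> bool:
    """Crude energy-based VAD: treat a frame as speech when its RMS
    energy exceeds a threshold. Silence and faint background noise
    fall below it and are discarded."""
    if not samples:
        return False
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms > threshold

def filter_silence(frames: list[list[float]]) -> list[list[float]]:
    """Keep only frames classified as speech."""
    return [f for f in frames if is_speech(f)]
```

Dropping non-speech frames is where the efficiency gain comes from: the transcription engine never spends compute on silence.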
External providers (OpenAI-Compatible, Groq, OpenAI)
Your microphone → Meetily captures audio
→ Audio is split into chunks
→ Each chunk is sent as a WAV file to the provider's API
→ Provider returns text with timestamps
→ Text appears on screen
- Prompt chaining keeps context - Meetily sends the last 50 words from the previous chunk as context for the next one. This means if someone says "...and the quarterly results show that-" at the end of one chunk and "-we exceeded our targets" at the start of the next, the transcription stays coherent.
- Automatic retry - if a chunk fails (network blip, server error), Meetily retries with a simpler response format.
- Your audio recording is always saved locally - even if the cloud provider has issues, you never lose the recording itself.
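The prompt-chaining step above amounts to keeping a sliding window of the last 50 transcribed words and passing it along with the next chunk (illustrative Python; not Meetily's actual code):

```python
def context_prompt(previous_text: str, max_words: int = 50) -> str:
    """Return the last `max_words` words of the previous chunk's
    transcript. Sent as the prompt for the next chunk so sentences
    that span a chunk boundary stay coherent."""
    words = previous_text.split()
    return " ".join(words[-max_words:])
```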
Where Settings Are Stored
All settings are stored locally on your machine. Nothing is sent to our servers.
| Setting | Where | Details |
|---|---|---|
| Selected provider | App database | Which provider is active (e.g., "groq") |
| Selected model | App database | Which model to use (e.g., "whisper-large-v3") |
| API key | App database + config file | Your secret key, stored per-provider |
| Server URL | Config file + browser storage | Saved per-provider, restored on switch |
Key point: Each provider's credentials are completely isolated. Your Groq API key is never mixed with your OpenAI settings. When you switch between providers, each one loads its own saved configuration.
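Conceptually, that isolation works like a dictionary keyed by provider name - each provider reads and writes only its own entry. A minimal Python sketch of the idea (not Meetily's actual storage code; the class and placeholder key are hypothetical):

```python
import json

class ProviderSettings:
    """Per-provider settings store: each provider keeps an isolated
    config dict, so switching providers restores previous values."""

    def __init__(self) -> None:
        self._store: dict[str, dict] = {}

    def save(self, provider: str, **settings) -> None:
        self._store.setdefault(provider, {}).update(settings)

    def load(self, provider: str) -> dict:
        return dict(self._store.get(provider, {}))

    def to_json(self) -> str:
        """Serialize for writing to a config file."""
        return json.dumps(self._store)
```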
Real-World Examples
Example 1: Daily standups on a Mac
"I run a 15-minute standup every morning. I want fast, private transcription."
Setup: Whisper → small model → Start recording. Done. Audio stays on your Mac, transcription takes ~2 seconds per chunk on an M1.
Example 2: Long client calls with cloud speed
"I have hour-long client calls and want transcription without loading my laptop."
Setup: Groq → paste API key → select whisper-large-v3 → Test Connection → Start recording. Groq processes audio in the cloud, your laptop stays cool.
Example 3: Self-hosted for privacy compliance
"Our company policy says no audio can leave our network. I run Speaches on an internal server."
Setup: OpenAI-Compatible → Server URL: http://transcription-server.internal:8000 → Models load automatically → select model → Test Connection → Start recording. Audio goes to your internal server, never to the internet.
Example 4: Specialized medical transcription
"I found a Whisper model fine-tuned for medical terminology on HuggingFace."
Setup: Whisper → Add from HuggingFace → paste repo URL → pick the GGML model file → download → select it → Start recording. The custom model handles medical terms better than the default models.
Screenshots Guide
These are the key screens worth capturing for visual reference:
| # | What to capture | Why it helps |
|---|---|---|
| 1 | Provider dropdown open - showing On-Device group (Whisper with "Recommended" badge, Parakeet, OpenAI-Compatible) and Cloud group (Groq, OpenAI) | Shows the first thing users see |
| 2 | Groq fully configured - API key locked, model selected with blue verified badge, privacy note visible | Shows what a "ready" cloud setup looks like |
| 3 | Server URL info popover - the two-column panel with clickable URLs and copy buttons | Users need to know this helper exists |
| 4 | Model dropdown open - showing Verified group (blue badges), Others group, and a Custom entry | Explains the model organization |
| 5 | API key locked vs unlocked - side by side showing the lock icon and the red shake animation | Common point of confusion |
| 6 | Empty state before API key - disabled model dropdown showing "Enter API key to load available models" | Shows what new users see first |
| 7 | Connection test success - green URL border, checkmark, "Connected (42ms)" | Shows what success looks like |
| 8 | Connection test failure - red border, X icon, error message | Shows what to look for when debugging |
| 9 | HuggingFace file picker - list of model files with sizes and download buttons | The HF flow is multi-step |
| 10 | HuggingFace download in progress - progress bar, percentage, cancel button | Users need to know they can cancel |
| 11 | Privacy note variants - "No data leaves your machine" vs "Audio is sent to Groq servers" | Builds trust |
| 12 | Recording-locked state - grayed out dropdown with toast message | Prevents confusion during recording |