Pluggable Transcription

Choose your transcription engine - local Whisper, Parakeet, self-hosted servers, or cloud APIs like Groq and OpenAI. One settings page, full control.

Last updated: March 10, 2026

TL;DR

Meetily can turn your meetings into text using different transcription engines. You pick which engine to use, and Meetily handles the rest. Local options (Whisper, Parakeet) keep audio on your machine. Cloud options (Groq, OpenAI) offer speed. Self-hosted servers (OpenAI-Compatible) give you full control. Everything is configured in one place: Settings > Transcript.

Pluggable Transcription

Meetily can turn your meetings into text using different transcription engines. You pick which engine to use, and Meetily handles the rest.

Think of it like choosing a printer - some are built into your computer (local), some are on the network (cloud). You pick one, configure it once, and every meeting uses it until you change your mind.

Everything is configured in one place: Settings > Transcript.


Quick Start: Which Provider Should I Use?

Not sure which one to pick? Here's a simple decision tree:

Are you on a Mac?
  → Yes → Use Whisper (it's pre-selected and marked "Recommended")
  → No, Windows → Use Parakeet (also marked "Recommended")

Want faster transcription without using your computer's resources?
  → Yes → Use Groq (free tier available, very fast)

Running your own transcription server?
  → Yes → Use OpenAI-Compatible

Need the highest possible accuracy and don't mind paying?
  → Yes → Use OpenAI

Available Providers

When you open the provider dropdown, you'll see two groups:

On-Device

These run on your computer. No internet needed after the initial model download. Your audio never leaves your machine.

| Provider | Best for | Example use case |
| --- | --- | --- |
| Whisper | macOS users | "I want accurate transcription running locally on my MacBook Pro. I don't want my meeting audio sent anywhere." |
| Parakeet | Windows users | "I have a Windows PC with an NVIDIA GPU and want fast local transcription." |
| OpenAI-Compatible | Self-hosted servers | "I run a Speaches server on my home lab / office server and want Meetily to use it." |

Recommended Badges

On macOS, Whisper shows a small amber "Recommended" badge. On Windows, Parakeet gets the badge instead.

Cloud

These send your audio to external servers for transcription. Faster than local - but requires internet and an API key.

| Provider | Best for | Example use case |
| --- | --- | --- |
| Groq | Speed-focused users | "I want the fastest possible transcription. Groq processes audio in seconds." |
| OpenAI | Accuracy-focused users | "I want OpenAI's official Whisper API for maximum accuracy." |

More cloud providers (Together AI, Fireworks AI) are coming soon.


How to Switch Providers

Step-by-step walkthrough

  1. Click the gear icon to open Settings, then select the Transcript tab
  2. Find the Transcription Provider dropdown at the top
  3. Click it - you'll see two groups: On-Device and Cloud
  4. Click a provider name to select it
  5. A green toast notification confirms: "Switched to [provider name]"
  6. New configuration fields appear below the dropdown based on your choice:
    • Whisper → model manager (download/select models)
    • Parakeet → model download prompt
    • Groq/OpenAI → API key field, model selector, test connection button
    • OpenAI-Compatible → server URL field, optional API key, model selector, test connection button

What gets saved, and when

| Action | When it saves | Example |
| --- | --- | --- |
| Selecting a provider | Immediately | You click "Groq" → saved right away |
| Selecting a model | Immediately | You pick "whisper-large-v3" → saved right away |
| Typing an API key | When you click away from the field | You paste your key, then click somewhere else → saved |
| Typing a server URL | When you click away from the field | You type http://localhost:8000, then click elsewhere → saved |

Switching back restores everything

Example: You set up Groq with your API key and whisper-large-v3 model. Then you switch to Whisper for a week. When you switch back to Groq, your API key and model choice are still there - nothing lost.
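The per-provider restore behavior can be pictured with a small sketch. This is illustrative only, not Meetily's actual storage code: each provider gets its own isolated settings bucket, and switching only changes which bucket is active.

```python
# Illustrative sketch (not Meetily's implementation): each provider
# keeps an isolated settings bucket, so switching away and back
# restores exactly what you saved.
class ProviderSettings:
    def __init__(self):
        self._store = {}    # provider name -> saved config
        self.active = None

    def switch(self, provider):
        """Activate a provider, creating an empty bucket on first use."""
        self._store.setdefault(provider, {})
        self.active = provider

    def save(self, **fields):
        """Persist fields into the active provider's bucket only."""
        self._store[self.active].update(fields)

    def config(self):
        return dict(self._store[self.active])

settings = ProviderSettings()
settings.switch("groq")
settings.save(api_key="gsk_example", model="whisper-large-v3")
settings.switch("whisper")   # use Whisper for a week...
settings.save(model="small")
settings.switch("groq")      # ...switch back: nothing lost
```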

Can't switch during a recording

The provider dropdown is grayed out while recording. If you click it, a toast message appears: "Transcription provider cannot be changed while recording is in progress." Same for the model dropdown. Finish your recording first, then switch.


Setting Up Each Provider

Whisper (Local)

The simplest option - everything runs on your machine with zero configuration.

Steps:

  1. Select Whisper from the provider dropdown
  2. The model manager appears below, showing available models:
| Model | Speed | Accuracy | Download size | Best for |
| --- | --- | --- | --- | --- |
| tiny | Fastest | Lower | ~75 MB | Quick testing, low-powered machines |
| base | Fast | Moderate | ~142 MB | Casual meetings, development |
| small | Medium | Good | ~466 MB | Daily use, most meetings |
| medium | Slower | Better | ~1.5 GB | Important meetings, professional use |
| large-v3 | Slowest | Best | ~3 GB | Maximum accuracy, presentations |
  3. Click the download button next to a model. A progress bar shows the download.
  4. Once downloaded, select the model - it's now your active transcription engine.
  5. Start recording. Text appears in real-time.

Example scenario: You download small (466 MB, takes ~2 minutes). You start recording a team standup. Words appear on screen as people speak. Everything runs on your Mac's M1 chip - your audio never touches the internet.

Privacy note at the bottom of settings confirms: "Audio is processed on your local server. No data leaves your machine."


Adding Custom Models from HuggingFace

The built-in models cover most needs, but if you want a specialized or community-tuned Whisper model, you can download one from HuggingFace.

Example: You found a Whisper model fine-tuned for medical terminology on HuggingFace and want to use it for transcribing doctor-patient consultations.

Steps:

  1. In the Whisper model manager, click Add from HuggingFace
  2. Paste a HuggingFace repository URL into the input field
    • Example: https://huggingface.co/ggerganov/whisper.cpp
    • You can also paste just the repo path: ggerganov/whisper.cpp
  3. Meetily scans the repository and shows a file picker listing all compatible model files:
    • Each file shows its name, size (e.g., "463 MB"), and a download icon
  4. Click the download icon next to the model you want
  5. A progress bar appears:
    • Shows percentage complete and MB downloaded (e.g., "234 MB / 463 MB")
    • A cancel button lets you stop the download at any time - partial files are cleaned up automatically
  6. Once complete, the model appears in your model list with:
    • An "HF" badge to distinguish it from built-in models
    • The model name, file size, and source URL
    • A green "Ready" indicator
    • Hover over the model card to reveal a delete button if you want to remove it later

If the repository is private or gated: Meetily shows an amber banner: "This model requires authentication." Enter your HuggingFace token (get one at huggingface.co/settings/tokens).

What if I pick the wrong file type? Meetily validates models automatically:

  • GGML files (.bin, .ggml) - accepted
  • GGUF files - rejected with message: "No GGML model files found. Note: GGUF models are not compatible with whisper.cpp."
  • F32 (unquantized) models - rejected (too large for real-time use, detected within the first few seconds of download)
  • PyTorch .bin files - rejected (wrong format, detected by file header)
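The validation rules above can be sketched in a few lines. This is a simplified illustration, not Meetily's implementation: it only checks the GGUF magic bytes and the file extension, whereas the real validator also reads the GGML header to catch PyTorch .bin files and unquantized F32 models.

```python
# Simplified sketch of the validation described above (not Meetily's
# actual code). GGUF files begin with the four magic bytes "GGUF";
# anything else is judged here by extension alone.
GGUF_MAGIC = b"GGUF"

def classify_model_file(name: str, header: bytes) -> str:
    """Return 'accepted' or a rejection reason for a candidate model file."""
    if header[:4] == GGUF_MAGIC:
        return "rejected: GGUF models are not compatible with whisper.cpp"
    if name.endswith((".bin", ".ggml")):
        return "accepted"
    return "rejected: not a recognized GGML model file"

print(classify_model_file("ggml-small.bin", b"lmgg\x00\x00\x00\x00"))
```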

Parakeet (Local)

  1. Select Parakeet from the provider dropdown
  2. Download the Parakeet model when prompted
  3. Start recording

Best for Windows machines with NVIDIA GPUs. Like Whisper, everything runs locally and your audio stays on your machine.

Example scenario: You have a Windows desktop with an RTX 4070. Parakeet uses NVIDIA's optimized engine for fast, accurate transcription without any cloud dependency.


OpenAI-Compatible (Self-Hosted Servers)

Use this when you run your own transcription server - on your computer, your home lab, or your office network.

Example scenario: You run Speaches on a spare Linux box with a GPU. You want Meetily on your Mac to send audio to that server for transcription.

Steps:

  1. Select OpenAI Compatible from the provider dropdown

  2. Enter your Server URL - this is the address where your server is running.

    • Examples: http://localhost:8000, http://192.168.1.50:8000, http://my-server.local:8080
    • Tip: Click the info icon (i) next to the URL field. A helper panel pops up with two columns:
    | Local servers | Cloud providers |
    | --- | --- |
    | Speaches → http://localhost:8000 | OpenAI → https://api.openai.com |
    | whisper.cpp → http://localhost:8080 | Groq → https://api.groq.com/openai |
    | Faster Whisper Server → http://localhost:8000 | Together AI → https://api.together.xyz |
    | vLLM → http://localhost:8000 | Fireworks AI → https://api.fireworks.ai |

    Each URL is clickable - click one to auto-fill the field. There's also a copy button next to each URL. Clicking a URL from this panel immediately saves it and fetches available models.

  3. Enter an API Key (optional)

    • Most local servers (Speaches, whisper.cpp) don't need one - leave it blank
    • If your server uses authentication, paste the key here
  4. Wait for models to load - after entering the URL and clicking away, Meetily fetches available models from your server

    • A spinning icon appears next to the "Model" label while loading
    • If no URL is entered yet, you'll see a disabled dropdown with: "Enter server URL and API key to load models."
  5. Select a model from the dropdown

    • Models from your server appear automatically
    • If your model isn't listed, type its name - a "Use [your-model-name]" option appears
  6. Click Test Connection to verify everything works

What the URL field border tells you:

  • Green border + checkmark = your server is reachable and the model exists
  • Red border + X icon = something's wrong (see Troubleshooting below)

Privacy note at the bottom confirms: "Audio is processed on your local server. No data leaves your machine." (when using a localhost or local network URL)
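The base URLs in the helper panel all work the same way: the app appends the OpenAI-convention path `/v1/models` when fetching the model list. A minimal sketch of that mapping (the helper function is hypothetical; the endpoint path follows the OpenAI API convention):

```python
# Hypothetical helper: turn a base server URL into the OpenAI-compatible
# models endpoint. Trailing slashes on the base URL are tolerated.
def models_endpoint(base_url: str) -> str:
    return base_url.rstrip("/") + "/v1/models"

print(models_endpoint("http://localhost:8000"))
# e.g. Groq's base "https://api.groq.com/openai" maps to
# "https://api.groq.com/openai/v1/models" the same way.
```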


Groq / OpenAI (Cloud)

Example scenario (Groq): You want the fastest possible transcription and don't mind sending audio to the cloud. You signed up at groq.com and got a free API key.

Example scenario (OpenAI): You already pay for OpenAI's API and want to use their Whisper model for maximum accuracy.

Steps:

  1. Select the provider (e.g., Groq) from the dropdown

  2. Enter your API Key

    • The field shows dots (like a password) - your key is hidden by default
    • Click the eye icon on the right to reveal the key, click again to hide it
    • After you type/paste your key and click away from the field, two things happen:
      1. The key is saved
      2. The field locks automatically - a lock icon appears, and the field becomes read-only
    • This lock prevents you from accidentally editing or deleting your key

    To edit a locked key:

    • Click the lock icon to unlock the field
    • If you click the text field itself (instead of the lock icon), the lock icon shakes in red to show you where to click

    Before you enter a key, you'll see a disabled model dropdown with: "Enter API key to load available models."

  3. Models load automatically after you save your API key

    • A spinning icon appears next to "Model" while fetching from the provider's API
    • Once loaded, the model dropdown activates
  4. Select a model from the dropdown. Models are organized into groups:

    | Group | What it means | Visual indicator |
    | --- | --- | --- |
    | Verified | Curated models tested to work well. Best choice for most users. | Blue checkmark badge next to the name |
    | Others | Additional models from the API. May work, but not tested by us. | No badge |
    | Custom | Appears when you type a name that doesn't match any listed model. | Shows as "Use [name]" at the bottom |

    Example for Groq: The "Verified" group shows whisper-large-v3 and whisper-large-v3-turbo with blue badges. Pick one.

    Example for a custom model: You type my-fine-tuned-whisper in the search box. Since it doesn't match any listed model, a "Use my-fine-tuned-whisper" option appears at the bottom. Click it.

  5. Click Test Connection to verify

Privacy Note

When using cloud providers, your audio is sent to external servers for transcription. The privacy note at the bottom of settings updates to reflect this: "Audio is sent to [provider] servers for transcription. API key stored locally only."


Connection Testing

For any provider with a server URL or API key (OpenAI-Compatible, Groq, OpenAI), always test your connection before your first recording.

What the test checks

  1. Can Meetily reach the server? It connects to the provider's /v1/models endpoint.
  2. Is your model available? It checks if your selected model exists on the server.
  3. How fast is it? It measures and reports latency.
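The model-availability half of the test can be sketched as a pure function. This is illustrative, not Meetily's code; the field names follow the OpenAI `/v1/models` response shape, which lists models under a top-level "data" array.

```python
# Sketch of the model check: given a /v1/models response, is the
# selected model on the server? (Illustrative; not Meetily's code.)
def check_model(models_response: dict, wanted: str) -> str:
    ids = [m["id"] for m in models_response.get("data", [])]
    if wanted in ids:
        return "ok"
    return f"model '{wanted}' not found in {len(ids)} available models"

sample = {"data": [{"id": "whisper-large-v3"},
                   {"id": "whisper-large-v3-turbo"}]}
print(check_model(sample, "whisper-large-v3"))
```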

Test results you might see

| Result | What it means | What to do |
| --- | --- | --- |
| Green border + "Connected (42ms)" | Everything works. You're ready to record. | Nothing - you're good! |
| "Connected successfully but model 'X' not found in 5 available models" | Server is reachable, but doesn't have your model. | Pick a different model from the dropdown. |
| Red border + "Connection failed" | Can't reach the server at all. | Check the URL, make sure the server is running. |
| Grayed-out button + "Enter API key first" | Can't test without credentials. | Enter your API key first. |

Troubleshooting checklist

If the test fails, go through this list:

  1. Check the URL - is there a typo? Missing port number? (http://localhost:8000 not http://localhost)
  2. Is the server running? - try opening the URL in your browser. You should see a response (even if it's an error page).
  3. Is the API key correct? - try the key in the provider's own dashboard/playground first.
  4. Firewall or VPN? - local servers might be blocked by firewall rules. Cloud providers might be blocked by a corporate VPN.
  5. Wrong model? - the model name must exactly match what the server offers. Use the models that auto-populate rather than typing manually.

How It Works During Recording

When you press the record button, here's what happens behind the scenes:

Local providers (Whisper, Parakeet)

Your microphone → Meetily captures audio
                → Voice Activity Detection filters out silence
                → Only speech is sent to the Whisper/Parakeet engine on your machine
                → Text appears on screen in real-time
  • No internet used - everything happens on your CPU/GPU
  • Silence is skipped - VAD (Voice Activity Detection) filters out pauses, keyboard typing, background noise. This makes transcription ~70% more efficient.
  • Real-time results - words appear as each audio chunk (~5 seconds) is processed
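The silence-skipping idea can be illustrated with a toy energy-based gate. Real VADs (Silero, WebRTC VAD, and whatever Meetily actually ships) are far more robust; this sketch just thresholds per-chunk RMS energy to show the concept.

```python
# Toy energy-based VAD: keep only audio chunks whose RMS energy
# suggests speech. Illustrative only; real VADs use learned models
# or spectral features, not a bare energy threshold.
def rms(chunk):
    return (sum(s * s for s in chunk) / len(chunk)) ** 0.5

def speech_chunks(chunks, threshold=0.02):
    """Drop near-silent chunks before they reach the transcriber."""
    return [c for c in chunks if rms(c) >= threshold]

silence = [0.001] * 160
speech = [0.2, -0.3, 0.25, -0.15] * 40
print(len(speech_chunks([silence, speech, silence])))  # only speech survives
```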

External providers (OpenAI-Compatible, Groq, OpenAI)

Your microphone → Meetily captures audio
                → Audio is split into chunks
                → Each chunk is sent as a WAV file to the provider's API
                → Provider returns text with timestamps
                → Text appears on screen
  • Prompt chaining keeps context - Meetily sends the last 50 words from the previous chunk as context for the next one. This means if someone says "...and the quarterly results show that-" at the end of one chunk and "-we exceeded our targets" at the start of the next, the transcription stays coherent.
  • Automatic retry - if a chunk fails (network blip, server error), Meetily retries with a simpler response format.
  • Your audio recording is always saved locally - even if the cloud provider has issues, you never lose the recording itself.
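The prompt-chaining step above can be sketched as a small helper: take the tail of the previous chunk's transcript and pass it as context with the next request. The 50-word window matches the behavior described above; the function itself is illustrative, not Meetily's code.

```python
# Sketch of prompt chaining: carry the last N words of the previous
# chunk's transcript as context for the next transcription request.
# (Illustrative; the 50-word window follows the description above.)
def context_prompt(previous_text: str, max_words: int = 50) -> str:
    words = previous_text.split()
    return " ".join(words[-max_words:])

prev = "word " * 60 + "and the quarterly results show that"
prompt = context_prompt(prev)
print(prompt[-40:])  # ends with the sentence fragment mid-thought
```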

Where Settings Are Stored

All settings are stored locally on your machine. Nothing is sent to our servers.

| Setting | Where | Details |
| --- | --- | --- |
| Selected provider | App database | Which provider is active (e.g., "groq") |
| Selected model | App database | Which model to use (e.g., "whisper-large-v3") |
| API key | App database + config file | Your secret key, stored per-provider |
| Server URL | Config file + browser storage | Saved per-provider, restored on switch |

Key point: Each provider's credentials are completely isolated. Your Groq API key is never mixed with your OpenAI settings. When you switch between providers, each one loads its own saved configuration.


Real-World Examples

Example 1: Daily standups on a Mac

"I run a 15-minute standup every morning. I want fast, private transcription."

Setup: Whisper → small model → Start recording. Done. Audio stays on your Mac, transcription takes ~2 seconds per chunk on an M1.

Example 2: Long client calls with cloud speed

"I have hour-long client calls and want transcription without loading my laptop."

Setup: Groq → paste API key → select whisper-large-v3 → Test Connection → Start recording. Groq processes audio in the cloud, your laptop stays cool.

Example 3: Self-hosted for privacy compliance

"Our company policy says no audio can leave our network. I run Speaches on an internal server."

Setup: OpenAI-Compatible → Server URL: http://transcription-server.internal:8000 → Models load automatically → select model → Test Connection → Start recording. Audio goes to your internal server, never to the internet.

Example 4: Specialized medical transcription

"I found a Whisper model fine-tuned for medical terminology on HuggingFace."

Setup: Whisper → Add from HuggingFace → paste repo URL → pick the GGML model file → download → select it → Start recording. The custom model handles medical terms better than the default models.


Screenshots Guide

These are the key screens worth capturing for visual reference:

| # | What to capture | Why it helps |
| --- | --- | --- |
| 1 | Provider dropdown open - showing On-Device group (Whisper with "Recommended" badge, Parakeet, OpenAI-Compatible) and Cloud group (Groq, OpenAI) | Shows the first thing users see |
| 2 | Groq fully configured - API key locked, model selected with blue verified badge, privacy note visible | Shows what a "ready" cloud setup looks like |
| 3 | Server URL info popover - the two-column panel with clickable URLs and copy buttons | Users need to know this helper exists |
| 4 | Model dropdown open - showing Verified group (blue badges), Others group, and a Custom entry | Explains the model organization |
| 5 | API key locked vs unlocked - side by side showing the lock icon and the red shake animation | Common point of confusion |
| 6 | Empty state before API key - disabled model dropdown showing "Enter API key to load available models" | Shows what new users see first |
| 7 | Connection test success - green URL border, checkmark, "Connected (42ms)" | Shows what success looks like |
| 8 | Connection test failure - red border, X icon, error message | Shows what to look for when debugging |
| 9 | HuggingFace file picker - list of model files with sizes and download buttons | The HF flow is multi-step |
| 10 | HuggingFace download in progress - progress bar, percentage, cancel button | Users need to know they can cancel |
| 11 | Privacy note variants - "No data leaves your machine" vs "Audio is sent to Groq servers" | Builds trust |
| 12 | Recording-locked state - grayed out dropdown with toast message | Prevents confusion during recording |

Frequently Asked Questions

Which provider should I start with?

If you're on a Mac, start with Whisper (it's the default and marked 'Recommended'). On Windows, start with Parakeet. If you want speed without using your computer's resources, try Groq (free tier available). See the Quick Start section above for a full decision guide.

Can I use multiple providers at the same time?

No. One provider is active at a time. All recordings use the selected provider until you change it.

Do local providers need an internet connection?

Only to download the model the first time. After that, everything runs fully offline.

Will I lose my settings when I switch providers?

No. Each provider's settings (API key, server URL, model) are saved independently. Switch away and back - everything is restored exactly as you left it.

What does the "Recommended" badge mean?

It indicates the best provider for your operating system. Whisper is recommended on macOS (uses Apple's Metal GPU), Parakeet is recommended on Windows (uses NVIDIA's optimized engine).

What's the difference between OpenAI and OpenAI-Compatible?

OpenAI is OpenAI's official cloud API at api.openai.com. Requires an API key, sends audio to OpenAI's servers. OpenAI-Compatible is any server that speaks the same protocol - could be a server on your own computer (free, no API key, audio stays local), a server on your office network, or a third-party service. Think of it this way: 'OpenAI' is a specific restaurant. 'OpenAI-Compatible' is any restaurant that serves the same menu.

Can I use a model that isn't in the dropdown?

Yes. Type the model name in the search box. A 'Use [your-model-name]' option appears at the bottom - click it. This is useful for custom fine-tuned models or new models your server supports.

Are my API keys stored securely?

Yes. API keys are stored locally on your machine in the app's database. They are never sent to Meetily's servers, only sent to the specific provider you configured, hidden (masked) by default in the UI, and locked after entry to prevent accidental edits.

Why can't I switch providers while recording?

Switching engines mid-recording would cause gaps or inconsistencies in the transcript. Finish your current recording first, then switch.

What happens if a cloud provider goes down mid-recording?

Meetily retries failed chunks with a simpler response format. If the provider stays unreachable, some text chunks may be lost - but your audio recording is always saved locally regardless. You can re-transcribe the audio later.

What does "model not found" mean?

The server is reachable but doesn't have the model you selected. Open the model dropdown and pick one of the models that auto-populated from the server. Those are the ones actually available.

How do I edit a locked API key?

Click the lock icon (not the text field). The lock icon is on the right side of the input. Clicking the text field itself just makes the lock shake in red - that's the app telling you to click the lock instead.

Ready to get started?

Download Meetily and start transcribing your meetings locally with full privacy.

Have questions? Join our GitHub community.