Question 1

Does transcription happen on my device?

Accepted Answer

Yes. Everything runs in your browser. Your audio files are never sent to any server.

Question 2

How accurate is the transcription?

Accepted Answer

Using the Whisper small model, accuracy is typically 90–95% for clear speech in supported languages. Accuracy decreases with heavy accents or background noise.

Question 3

How long does the first load take?

Accepted Answer

The Whisper small model is approximately 150 MB. Download time depends on your connection speed. It is cached after the first download.
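As a rough illustration of how connection speed affects the first load, the sketch below estimates download time for a 150 MB file. The function name and the sample speeds are illustrative assumptions, and protocol overhead is ignored, so real downloads will take somewhat longer.

```typescript
// Approximate model size taken from the answer above.
const MODEL_SIZE_MB = 150;

// Estimate download time in seconds for a file of sizeMB megabytes
// over a connection of speedMbps megabits per second.
function estimateDownloadSeconds(sizeMB: number, speedMbps: number): number {
  if (speedMbps <= 0) {
    throw new RangeError("speedMbps must be positive");
  }
  // 1 megabyte = 8 megabits; overhead and latency are not modelled.
  return (sizeMB * 8) / speedMbps;
}

// Sample speeds are illustrative only.
for (const mbps of [10, 50, 100]) {
  const seconds = Math.round(estimateDownloadSeconds(MODEL_SIZE_MB, mbps));
  console.log(`${mbps} Mbps: about ${seconds} s`);
}
```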

Question 4

Which languages are supported?

Accepted Answer

All 99 languages supported by the Whisper model are available, including English, Japanese, Spanish, French, German, Chinese, Arabic, and many more.

Question 5

What is the maximum file size?

Accepted Answer

Files up to 500 MB are supported. For very large files, processing may take longer depending on your device.

Question 6

What is an SRT file? Can I use it for YouTube subtitles?

Accepted Answer

SRT is a subtitle file format that includes timestamps for each line of text. You can upload SRT files directly to YouTube's subtitle feature, making it easy to add accurate captions to your videos.
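To make the format concrete, here is a minimal sketch that turns timed text segments into SRT. Each cue is a sequence number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the text, and a blank line. The `Segment` shape and function names are assumptions for illustration, not the tool's actual export code.

```typescript
// Hypothetical segment shape: start and end are in seconds.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Convert seconds to the SRT timestamp form HH:MM:SS,mmm.
function toSrtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Build SRT text: numbered cues separated by blank lines.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) =>
      `${i + 1}\n${toSrtTimestamp(seg.start)} --> ${toSrtTimestamp(seg.end)}\n${seg.text.trim()}\n`)
    .join("\n");
}

console.log(toSrt([
  { start: 0, end: 2.5, text: "Hello and welcome." },
  { start: 2.5, end: 5, text: "Let's get started." },
]));
```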

Question 7

What is speaker diarization?

Accepted Answer

Speaker diarization automatically identifies and labels the different speakers in an audio file (for example, "Speaker 1" and "Speaker 2") so you can tell who said what. This feature is currently in beta.
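The sketch below shows only the labelling step: given segments that already carry a speaker index, it produces "Speaker N" lines and merges consecutive segments from the same speaker. The `DiarizedSegment` shape and the zero-based index are assumptions for illustration; how the tool actually detects speakers is not described here.

```typescript
// Hypothetical output of a diarization step: zero-based speaker index plus text.
interface DiarizedSegment {
  speaker: number;
  text: string;
}

// Label each turn as "Speaker N" and merge back-to-back segments
// that belong to the same speaker into one line.
function formatBySpeaker(segments: DiarizedSegment[]): string[] {
  const lines: string[] = [];
  let previous: number | null = null;
  for (const seg of segments) {
    if (seg.speaker === previous && lines.length > 0) {
      lines[lines.length - 1] += " " + seg.text;
    } else {
      lines.push(`Speaker ${seg.speaker + 1}: ${seg.text}`);
      previous = seg.speaker;
    }
  }
  return lines;
}

console.log(formatBySpeaker([
  { speaker: 0, text: "Shall we begin?" },
  { speaker: 0, text: "First item is the budget." },
  { speaker: 1, text: "I have the figures here." },
]).join("\n"));
```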

Question 8

Can I use it offline after the first load?

Accepted Answer

Yes. After the model is downloaded on first use, it is cached in your browser. From the second visit onwards, transcription works fully offline.
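The cache-first pattern behind this behaviour can be sketched as follows: look in local storage first and go to the network only on a miss. The `ModelStore` interface, the in-memory store, and the `loadModel` function are hypothetical names for illustration. In a browser the store would be backed by persistent storage, and the mechanism this tool actually uses is not specified here.

```typescript
// Hypothetical storage abstraction; a browser build would back this
// with persistent storage rather than memory.
interface ModelStore {
  get(key: string): Promise<Uint8Array | undefined>;
  put(key: string, data: Uint8Array): Promise<void>;
}

// In-memory stand-in so the sketch runs anywhere.
class InMemoryStore implements ModelStore {
  private items = new Map<string, Uint8Array>();
  async get(key: string): Promise<Uint8Array | undefined> {
    return this.items.get(key);
  }
  async put(key: string, data: Uint8Array): Promise<void> {
    this.items.set(key, data);
  }
}

// Cache-first load: return the stored copy if present,
// otherwise download once and store the result for later visits.
async function loadModel(
  key: string,
  store: ModelStore,
  download: () => Promise<Uint8Array>,
): Promise<{ data: Uint8Array; fromCache: boolean }> {
  const cached = await store.get(key);
  if (cached !== undefined) {
    return { data: cached, fromCache: true };
  }
  const data = await download();
  await store.put(key, data);
  return { data, fromCache: false };
}
```

On the first call the `download` callback runs and the result is stored. On later calls with the same key the callback is never invoked, which is why no network connection is needed.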

Question 9

How accurate is Japanese transcription?

Accepted Answer

Using OpenAI's Whisper small model, accuracy is 90% or higher for clear Japanese speech. Technical terms and regional dialects may reduce accuracy.

Question 10

Can I transcribe meeting recordings?

Accepted Answer

Yes. MP3, WAV, M4A, and FLAC formats are supported. Recordings from Zoom, Google Meet, and other platforms can be uploaded directly.
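A pre-upload check along these lines can be sketched as below, combining the four listed formats with the 500 MB limit stated in the file-size answer. The function name, the extension-based check, and the use of binary megabytes (1 MB = 1024 × 1024 bytes) are assumptions for illustration; the tool may detect formats differently.

```typescript
// Formats listed in the answer above.
const SUPPORTED_EXTENSIONS = ["mp3", "wav", "m4a", "flac"];

// 500 MB limit from the FAQ; binary megabytes are an assumption.
const MAX_BYTES = 500 * 1024 * 1024;

interface UploadCheck {
  ok: boolean;
  reason?: string;
}

// Validate a file by extension and size before starting transcription.
function checkUpload(fileName: string, sizeBytes: number): UploadCheck {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  if (SUPPORTED_EXTENSIONS.indexOf(ext) === -1) {
    return { ok: false, reason: `Unsupported format: "${ext || "none"}"` };
  }
  if (sizeBytes > MAX_BYTES) {
    return { ok: false, reason: "File exceeds the 500 MB limit" };
  }
  return { ok: true };
}

console.log(checkUpload("zoom-meeting.m4a", 48 * 1024 * 1024));
console.log(checkUpload("notes.txt", 1024));
```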

Question 11

Does it work on smartphones?

Accepted Answer

Yes. Since it is browser-based, it works on smartphones running Chrome or Safari. A Wi-Fi connection is recommended for the initial model download.

Question 12

Is my audio saved on a server?

Accepted Answer

No. All processing happens entirely within your browser. Your audio data is never sent to any external server.

Feature	Zeraku	Service A	Service B
Completely free	✓	△ Up to 3/month	△ Up to 10 min
Privacy (no data upload)	✓ Browser-only	✗ Server upload	✗ Server upload
No account required	✓	✗ Required	✗ Required
Works offline (2nd visit+)	✓ Cached model	✗ Always online	✗ Always online
Supported languages	99 languages (auto-detect)	58 languages	100+ languages (paid plan only)
SRT/VTT subtitle export	✓ Free	✓ Paid plan only	✓ Paid plan only
Speaker diarization	△ Beta	✓ Paid plan only	✓ Paid plan only
Max file size (free)	500 MB	25 MB (free plan)	100 MB (free plan)

Transcription Audio

What is Transcription Audio?

Key Features

How It Works

Load Model

Upload Audio

Transcription

Review & Export

Who Is This For?

Why Use Transcription Audio?

How Zeraku Compares to Cloud Services

Beginner's Guide

Technical Details

Frequently Asked Questions

Related Tools

Audio Noise Suppressor

AI Image Detector

Social Media Image Generator