Word-by-word match against the target
Read a passage aloud. We compare what you said against what you should have said, word by word, and color-code the result: matched, substituted, or skipped. No guesswork about what went wrong.
Read English passages aloud, get word-by-word accuracy scoring, IPA-level pronunciation hints, and targeted drills for the sounds you keep getting wrong. Free, voice-to-voice, with audio playback so you can hear yourself vs the reference.
Start practicing for free. No credit card. Sign in with Google.
If you said 'tree' instead of 'three', you'll see /θriː/ with a plain-English mouth-shape instruction: 'TH-ree — place your tongue between your teeth and push air'. Hear the reference, repeat, drill.
Reading too fast = words run together. Too slow = robotic. The pace score shows where you drifted. Pitch and volume variance tell you if you sounded confident or monotone.
Pronunciation is the part of English that books cannot teach. You can read every grammar rule, memorize 10,000 words, and still be misunderstood the moment you say 'three' as 'tree' or 'asked' as 'axed'. The fix is the boring part: read aloud, hear what you actually said, fix the specific sound, repeat.
Most apps try to teach pronunciation through repeat-after-me drills with a single static score at the end. That doesn't work because (a) you don't know which sound was wrong, and (b) you're not learning the underlying mechanics. fluentwith's Reading mode does it differently. You pick a passage matched to your level. The teleprompter scrolls at a target words-per-minute. You read aloud. We record everything.
When you finish, the system runs three steps in order, each building on the one before. First, the audio gets transcribed by Whisper, an accurate speech-to-text model that respects what you actually said, not what you should have said. Second, we diff your transcript against the target text, marking each word as matched, substituted, or skipped. Third, for every substituted word, we generate an IPA transcription, a plain-English mouth-shape instruction, and an explanation of why speech-to-text misheard you (often a hint about which sound to focus on).
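To make the diff step concrete, here is a minimal sketch of word-level alignment in Python. It assumes the transcript is already available as plain text and uses the standard library's `difflib.SequenceMatcher`. The function names `tokenize` and `diff_words` are hypothetical, and this illustrates the general technique, not necessarily how fluentwith implements it.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    # Lowercase and keep only letters and apostrophes, so punctuation
    # and capitalization never count as a mismatch.
    return re.findall(r"[a-z']+", text.lower())


def diff_words(target, spoken):
    """Label each target word as matched, substituted, or skipped.

    Returns a list of (target_word, label, heard_word_or_None) tuples.
    """
    t, s = tokenize(target), tokenize(spoken)
    result = []
    matcher = SequenceMatcher(a=t, b=s, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result += [(w, "matched", w) for w in t[i1:i2]]
        elif tag == "replace":
            heard = s[j1:j2]
            # Pair words position by position; target words left over
            # after the heard words run out are treated as skipped.
            for k, w in enumerate(t[i1:i2]):
                if k < len(heard):
                    result.append((w, "substituted", heard[k]))
                else:
                    result.append((w, "skipped", None))
        elif tag == "delete":
            result += [(w, "skipped", None) for w in t[i1:i2]]
        # "insert" opcodes (extra spoken words with no target
        # counterpart) are ignored in this sketch.
    return result


report = diff_words("I have three red apples.", "I have tree apples")
for word, label, heard in report:
    print(word, label, heard)
```

In this example 'three' comes back as substituted (heard as 'tree') and 'red' as skipped, which is the kind of color-coded result the report shows.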
Common patterns the system catches: 'th' substitutions (the unvoiced /θ/ in 'three' becoming /t/ or /f/, the voiced /ð/ in 'this' becoming /d/ or /z/); v/w confusion ('very' vs 'wery'), common among speakers of Slavic languages and Indian speakers of English; final consonant cluster simplification ('asked' → 'ast', 'films' → 'fim'); l/r distinction; word-stress placement.
Beyond the pronunciation report, you get a pace score (how close to target WPM you read), a confidence score (pitch and volume variation), and the audio of your reading saved locally so you can play it back and compare to the reference voice. Over time, the recurring-mistake detector spots that you keep mispronouncing, say, the 'th' sound — and queues minimal-pair drills targeting it specifically.
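As a rough illustration of the pace score, the sketch below computes words per minute and scores how close it is to the target. The scoring formula (one point lost per percent of deviation, floored at zero) and the function names are assumptions made for this example. The actual formula fluentwith uses is not described here.

```python
def words_per_minute(word_count, seconds):
    # Convert a word count and a duration in seconds to WPM.
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return word_count * 60.0 / seconds


def pace_score(actual_wpm, target_wpm):
    # Hypothetical formula: 100 at the target pace, losing one point
    # per percent of deviation in either direction, never below 0.
    deviation = abs(actual_wpm - target_wpm) / target_wpm
    return round(max(0.0, 100.0 * (1.0 - deviation)), 1)


wpm = words_per_minute(120, 60)   # 120 words read in one minute
print(wpm, pace_score(wpm, 150))  # compared against a 150 WPM target
```

Scoring deviation symmetrically treats reading too fast and too slow the same way. A real implementation could weight them differently.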
Free, no credit card. Pick a passage and start in 30 seconds.
We use Whisper (an accurate speech-to-text model) to transcribe what you actually said, then compare it word by word against the target passage. If you said 'tree' when the passage said 'three', the system flags that as a substitution and generates a pronunciation hint. It's not phoneme-level scoring like Azure Pronunciation Assessment, but it's accurate enough at the word level to give actionable feedback.
AI can help you target specific pronunciation patterns that interfere with intelligibility — the 'th' sound, final consonant clusters, vowel length, word stress. It can't (and shouldn't try to) make you sound like a native — accent reduction in that sense is rarely the right goal. The goal is to be understood clearly, and that's what fluentwith's pronunciation feedback targets.
We have a curated library across three difficulty levels (beginner, intermediate, advanced) and eight topics — story, news, business, casual conversation, motivational, dialogue, tech, travel. Each passage has a recommended target words-per-minute (WPM) calibrated to its difficulty.
Yes. Every word the system flags as substituted gets an IPA transcription (e.g. /θriː/ for 'three'), a plain-English mouth-shape instruction ('place tongue between teeth and push air'), and an explanation of what the substitution suggests about your pronunciation. Tap to hear a reference voice say the word.
Yes. Every reading session saves the audio locally on your device. The report page has a 'Hear yourself' player and a 'Correct version' button (browser TTS at the target speed) so you can compare the two side by side.
Pick any passage with frequent 'th' words ('three', 'think', 'with', 'them') and read it aloud. The system flags every substitution. Over multiple sessions, the recurring-mistake detector notices the pattern and queues minimal-pair drills ('three vs tree', 'this vs dis') in your daily drill feed.
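One simple way to picture the recurring-mistake detector: count how many sessions each (target, heard) substitution appears in, and queue a drill once a pair crosses a threshold. The sketch below assumes each session is a list of such pairs. The name `recurring_pairs` and the default threshold of two sessions are hypothetical choices for illustration.

```python
from collections import Counter


def recurring_pairs(sessions, min_sessions=2):
    """Return substitution pairs seen in at least `min_sessions` sessions."""
    seen = Counter()
    for session in sessions:
        # set() so a pair repeated within one session counts only once.
        seen.update(set(session))
    return sorted(pair for pair, n in seen.items() if n >= min_sessions)


sessions = [
    [("three", "tree"), ("this", "dis")],
    [("three", "tree")],
    [("three", "tree"), ("this", "dis"), ("very", "wery")],
]
for target, heard in recurring_pairs(sessions):
    print(f"queue drill: {target} vs {heard}")
```

Here 'three vs tree' and 'this vs dis' would be queued, while the one-off 'very vs wery' would not.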