Whisper
This is an implementation of Whisper built on top of the two best improvements to the original Whisper algorithm.
The biggest gains are in reduced hallucinations, timestamp accuracy, and the ability to detect filler words.
Picking the right settings
- `decode_boost` switches between the two implementations used during decoding. Setting it to true uses the `whisper-timestamped` implementation, and setting it to false uses the `stable-ts` implementation. `whisper-timestamped` is slower but more accurate, specifically on timestamps and filler words, while `stable-ts` is faster but less accurate on timestamps. Both approaches significantly improve on the timestamps you would get from the original Whisper implementation or from other implementations such as WhisperX.
- Enabling `speaker_diarization` returns speaker IDs for each word in the transcript. This is useful if you want to know who said what.
- Enabling `speed_boost` uses smaller versions of Whisper with either decoding approach. This is useful if you want results faster and don't mind sacrificing some accuracy. When `decode_boost` is set to true, the `base` model is used, and when `decode_boost` is set to false, the `whisper-v3-turbo` model is used.
- `initial_prompt` lets you include uncommon words that might appear in your audio. This is useful for improving the accuracy and spelling of those words.
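
The way `decode_boost` and `speed_boost` interact can be sketched in a few lines of Python. This is only an illustration of the rules described above, not the service's actual code: `resolve_backend` and `build_initial_prompt` are hypothetical helper names, and joining the uncommon words with commas is an assumed prompt format.

```python
from typing import Iterable, Optional, Tuple


def resolve_backend(decode_boost: bool, speed_boost: bool) -> Tuple[str, Optional[str]]:
    """Return (implementation, smaller_model) for a pair of settings.

    smaller_model is None when speed_boost is off, because the text above
    does not say which model is used in that case.
    """
    # decode_boost picks the decoding implementation.
    implementation = "whisper-timestamped" if decode_boost else "stable-ts"
    if not speed_boost:
        return implementation, None
    # speed_boost swaps in a smaller model that depends on decode_boost.
    model = "base" if decode_boost else "whisper-v3-turbo"
    return implementation, model


def build_initial_prompt(uncommon_words: Iterable[str]) -> str:
    """Join uncommon words into one prompt string (assumed format)."""
    cleaned = [word.strip() for word in uncommon_words]
    return ", ".join(word for word in cleaned if word)


if __name__ == "__main__":
    print(resolve_backend(decode_boost=True, speed_boost=True))
    print(build_initial_prompt(["WhisperX", "stable-ts"]))
```

A helper like this makes the trade-off explicit before a job is submitted: `decode_boost=True` favors timestamp and filler-word accuracy, while `decode_boost=False` favors speed.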
Languages
Whisper supports 99 languages in total. If you already know the language of the original audio, enter its language code in the `language` parameter. If you don't, leave the `language` parameter blank and Whisper will detect the language automatically. The full set of supported languages is listed below.
- `en` (English)
- `zh` (Chinese)
- `de` (German)
- `es` (Spanish)
- `ru` (Russian)
- `ko` (Korean)
- `fr` (French)
- `ja` (Japanese)
- `pt` (Portuguese)
- `tr` (Turkish)
- `pl` (Polish)
- `ca` (Catalan)
- `nl` (Dutch)
- `ar` (Arabic)
- `sv` (Swedish)
- `it` (Italian)
- `id` (Indonesian)
- `hi` (Hindi)
- `fi` (Finnish)
- `vi` (Vietnamese)
- `he` (Hebrew)
- `uk` (Ukrainian)
- `el` (Greek)
- `ms` (Malay)
- `cs` (Czech)
- `ro` (Romanian)
- `da` (Danish)
- `hu` (Hungarian)
- `ta` (Tamil)
- `no` (Norwegian)
- `th` (Thai)
- `ur` (Urdu)
- `hr` (Croatian)
- `bg` (Bulgarian)
- `lt` (Lithuanian)
- `la` (Latin)
- `mi` (Maori)
- `ml` (Malayalam)
- `cy` (Welsh)
- `sk` (Slovak)
- `te` (Telugu)
- `fa` (Persian)
- `lv` (Latvian)
- `bn` (Bengali)
- `sr` (Serbian)
- `az` (Azerbaijani)
- `sl` (Slovenian)
- `kn` (Kannada)
- `et` (Estonian)
- `mk` (Macedonian)
- `br` (Breton)
- `eu` (Basque)
- `is` (Icelandic)
- `hy` (Armenian)
- `ne` (Nepali)
- `mn` (Mongolian)
- `bs` (Bosnian)
- `kk` (Kazakh)
- `sq` (Albanian)
- `sw` (Swahili)
- `gl` (Galician)
- `mr` (Marathi)
- `pa` (Punjabi)
- `si` (Sinhala)
- `km` (Khmer)
- `sn` (Shona)
- `yo` (Yoruba)
- `so` (Somali)
- `af` (Afrikaans)
- `oc` (Occitan)
- `ka` (Georgian)
- `be` (Belarusian)
- `tg` (Tajik)
- `sd` (Sindhi)
- `gu` (Gujarati)
- `am` (Amharic)
- `yi` (Yiddish)
- `lo` (Lao)
- `uz` (Uzbek)
- `fo` (Faroese)
- `ps` (Pashto)
- `tk` (Turkmen)
- `nn` (Nynorsk)
- `mt` (Maltese)
- `sa` (Sanskrit)
- `lb` (Luxembourgish)
- `my` (Myanmar)
- `bo` (Tibetan)
- `tl` (Tagalog)
- `mg` (Malagasy)
- `as` (Assamese)
- `tt` (Tatar)
- `haw` (Hawaiian)
- `ln` (Lingala)
- `ha` (Hausa)
- `ba` (Bashkir)
- `jw` (Javanese)
- `su` (Sundanese)
- `yue` (Cantonese)
- `my` (Burmese)
- `ca` (Valencian)
- `nl` (Flemish)
- `ht` (Haitian)
- `lb` (Letzeburgesch)
- `ps` (Pushto)
- `pa` (Panjabi)
- `ro` (Moldavian)
- `si` (Sinhalese)
- `es` (Castilian)
- `zh` (Mandarin)
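
As a rough sketch of how a caller might prepare the `language` parameter, the snippet below treats a blank value as a request for auto-detection and rejects codes that are not in the list. The `normalize_language` helper and the `SUPPORTED_LANGUAGES` mapping are hypothetical names introduced for illustration, and the mapping holds only a small subset of the codes above. Whether the service itself accepts uppercase codes or surrounding whitespace is not stated here, so the cleanup step is an assumption.

```python
from typing import Optional

# Small illustrative subset of the codes listed above; extend as needed.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "zh": "Chinese",
    "de": "German",
    "es": "Spanish",
    "fr": "French",
    "ja": "Japanese",
    "haw": "Hawaiian",
    "yue": "Cantonese",
}


def normalize_language(value: Optional[str]) -> Optional[str]:
    """Return a cleaned language code, or None to request auto-detection."""
    # Blank or missing input means "let Whisper detect the language".
    if value is None or not value.strip():
        return None
    code = value.strip().lower()
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return code


if __name__ == "__main__":
    print(normalize_language("EN"))
    print(normalize_language(""))
```

Validating the code before submitting audio surfaces typos early, instead of after a transcription job has already run.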