Models
======

End-to-end ASR models typically follow an encoder-decoder design: the encoder performs acoustic
modeling, i.e., it converts the speech waveform into features, and the decoder converts those
features into text. The encoder contains the bulk of the trainable parameters and is usually the
focus of SSL in ASR. Thus, any architecture that can serve as the encoder of an ASR model can be
pre-trained with SSL. For an overview of the model architectures currently supported in NeMo's ASR
collection, refer to `ASR Models <../models.html>`__. Note that SSL also uses encoder-decoder
models. During downstream fine-tuning, the encoder is retained, whereas the decoder used during
SSL is replaced with a module specific to the downstream task. Refer to
`checkpoints <./results.html>`__ to see how this is accomplished in NeMo.
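
The encoder hand-off described above can be illustrated with a minimal, framework-free sketch. It
is not NeMo's API: the function name ``init_from_ssl``, the ``encoder.`` key prefix, and the toy
parameter names are assumptions made for illustration, and plain Python lists stand in for weight
tensors.

.. code-block:: python

    def init_from_ssl(ssl_state, downstream_state, encoder_prefix="encoder."):
        """Copy encoder weights from an SSL checkpoint into a downstream model.

        Keys outside ``encoder_prefix`` (for example the SSL decoder) are ignored,
        so the downstream decoder keeps its fresh initialization.
        """
        merged = dict(downstream_state)
        copied = []
        for name, value in ssl_state.items():
            if name.startswith(encoder_prefix) and name in merged:
                merged[name] = value
                copied.append(name)
        return merged, copied


    # Hypothetical checkpoints: parameter name -> weights.
    ssl_state = {
        "encoder.layer0.weight": [0.1, 0.2],
        "encoder.layer1.weight": [0.3, 0.4],
        "decoder.proj.weight": [9.0, 9.0],   # SSL-only decoder, discarded
    }
    downstream_state = {
        "encoder.layer0.weight": [0.0, 0.0],
        "encoder.layer1.weight": [0.0, 0.0],
        "decoder.ctc.weight": [1.0, 1.0],    # new task-specific decoder
    }

    merged, copied = init_from_ssl(ssl_state, downstream_state)
    print(copied)                        # names of the encoder weights that were reused
    print(merged["decoder.ctc.weight"])  # downstream decoder is left untouched

The same prefix filter can be applied to a PyTorch ``state_dict()``, which is likewise a mapping
from parameter names to tensors.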