Checkpoints
===========

There are two main ways to load pretrained checkpoints in NeMo, as introduced in `loading ASR checkpoints <../results.html#checkpoints>`__.
In speaker diarization, the diarizer loads the checkpoints that are passed through the config file. For example:

Loading Local Checkpoints
-------------------------

Load VAD models:

.. code-block:: bash

  pretrained_vad_model='/path/to/vad_multilingual_marblenet.nemo' # local .nemo or pretrained vad model name
  ...
  # pass with hydra config
  config.diarizer.vad.model_path=pretrained_vad_model

Load speaker embedding models:

.. code-block:: bash

  pretrained_speaker_model='/path/to/titanet-l.nemo' # local .nemo or pretrained speaker embedding model name
  ...
  # pass with hydra config
  config.diarizer.speaker_embeddings.model_path=pretrained_speaker_model

Load neural diarizer models:

.. code-block:: bash

  pretrained_neural_diarizer_model='/path/to/diarizer_msdd_telephonic.nemo' # local .nemo or pretrained neural diarizer model name
  ...
  # pass with hydra config
  config.diarizer.msdd_model.model_path=pretrained_neural_diarizer_model

NeMo automatically saves checkpoints of a model you are training in the ``.nemo`` format.
You can also manually save your model at any point using :code:`model.save_to(<checkpoint_path>.nemo)`.

Inference
---------

.. note::
  For details and a deeper understanding, please refer to ``<NeMo_git_root>/tutorials/speaker_tasks/Speaker_Diarization_Inference.ipynb``.

Check out :doc:`Datasets <./datasets>` for preparing audio files and optional label files.

Run and evaluate the speaker diarizer with the command below:

.. code-block:: bash

  # Have a look at the instructions inside the script and pass the arguments you might need.
  python <NeMo_git_root>/examples/speaker_tasks/diarization/offline_diarization.py

NGC Pretrained Checkpoints
--------------------------

The ASR collection has checkpoints of several models trained on various datasets for a variety of tasks.
These checkpoints are obtainable from the NGC NeMo Automatic Speech Recognition collection.
The model cards on NGC contain more information about each of the available checkpoints.

In general, you can load a model by passing its model name in the following format:

.. code-block:: bash

  pretrained_vad_model='vad_multilingual_marblenet'
  pretrained_speaker_model='titanet_large'
  pretrained_neural_diarizer_model='diar_msdd_telephonic'
  ...
  config.diarizer.vad.model_path=pretrained_vad_model \
  config.diarizer.speaker_embeddings.model_path=pretrained_speaker_model \
  config.diarizer.msdd_model.model_path=pretrained_neural_diarizer_model

where the model name is the value under the "Model Name" entry in the table below.

Models for Speaker Diarization Pipeline
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. csv-table::
   :file: data/diarization_results.csv
   :align: left
   :widths: 30, 30, 40
   :header-rows: 1
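As a rough illustration of how the diarizer consumes these config entries, the sketch below loads one of the inference configs shipped with the NeMo examples, points it at pretrained checkpoint names from the table above, and runs the clustering diarizer. The config filename and the manifest/output paths are assumptions; adapt them to your NeMo version and environment.

.. code-block:: python

  # Minimal sketch: run the clustering-based diarization pipeline from Python.
  # The example config path below is an assumption -- the exact filename varies
  # between NeMo releases, so adjust it to match your checkout.
  from omegaconf import OmegaConf
  from nemo.collections.asr.models import ClusteringDiarizer

  config = OmegaConf.load(
      "examples/speaker_tasks/diarization/conf/inference/diar_infer_telephonic.yaml"
  )

  # Input manifest and output directory are placeholders for your own data.
  config.diarizer.manifest_filepath = "input_manifest.json"
  config.diarizer.out_dir = "./diar_outputs"

  # Pretrained model names from NGC (or paths to local .nemo files).
  config.diarizer.vad.model_path = "vad_multilingual_marblenet"
  config.diarizer.speaker_embeddings.model_path = "titanet_large"

  diarizer = ClusteringDiarizer(cfg=config)
  diarizer.diarize()  # writes RTTM predictions under config.diarizer.out_dir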
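Models referenced by name can also be downloaded individually and written back to a local ``.nemo`` file, which can then be passed as a local checkpoint path as described in Loading Local Checkpoints above. This is a minimal sketch; the output filename is an arbitrary example.

.. code-block:: python

  # Minimal sketch: fetch a pretrained speaker embedding model from NGC by name
  # and save it as a local .nemo checkpoint for later use in the diarizer config.
  from nemo.collections.asr.models import EncDecSpeakerLabelModel

  speaker_model = EncDecSpeakerLabelModel.from_pretrained(model_name="titanet_large")
  speaker_model.save_to("titanet_large_local.nemo")  # arbitrary local filename

  # The saved checkpoint can then be used as:
  #   config.diarizer.speaker_embeddings.model_path='titanet_large_local.nemo'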