---
source_path: "models/types/vad.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/vad/"
---

# VAD

Models of this type find speech segments in audio data streams.

VAD models have [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[vad](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#vad) and filenames that
by convention match `vad-*.snsr`.

**Also see these related items:** [VAD models](https://doc.sensory.com/tnl/7.8/models/index.md#vad-models) included in this distribution.
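
As an illustration of the filename convention, the following minimal sketch picks out names that match `vad-*.snsr` from a list of candidates. It uses only the Python standard library, not the TrulyNatural API; the function name `vad_model_names` and the example filenames are hypothetical. The pattern is only a convention, so the `task-type` setting remains the reliable way to identify a model's type.

```python
from fnmatch import fnmatchcase
from pathlib import Path
from typing import Iterable, List

VAD_PATTERN = "vad-*.snsr"  # naming convention described above


def vad_model_names(names: Iterable[str]) -> List[str]:
    """Return, sorted, the names whose final path component matches the pattern."""
    return sorted(n for n in names if fnmatchcase(Path(n).name, VAD_PATTERN))


# Hypothetical filenames, for illustration only.
candidates = [
    "vad-example.snsr",
    "spot-example.snsr",
    "models/vad-other.snsr",
    "vad-notes.txt",
]
print(vad_model_names(candidates))
```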

## Operation

```mermaid
flowchart TD
    start((start))
    fetch0[/samples from ->audio-pcm/]
    fetch1[/samples from ->audio-pcm/]
    audio0(^sample-count)
    audio1(^sample-count)

    silence(^silence)
    begin(^begin)
    END(^end)
    limit(^limit)

    process0[process]
    process1[process]
    out[\samples to <-audio-pcm\]

    final@{ shape: f-circ }

    start --> fetch0
    fetch0 --> audio0
    audio0 --> process0
    process0 --> fetch0
    process0 -->|speech start| begin
    process0 -->|timeout| silence
    silence --> final

    begin --> fetch1
    fetch1 --> audio1
    audio1 --> out
    out --> process1
    process1 --> fetch1
    process1 -->|speech end| END
    process1 -->|speech limit| limit
    END --> final
    limit --> final

    final --> fetch0
```

Endpointing flow.

1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. If speech is detected within [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) ms, continue at step 6.
4. If _no_ speech is detected within [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) ms, invoke [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence)
   and restart from step 1.
5. Continue processing at step 1 until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end).
6. Invoke [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin).
7. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
8. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
9. If [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through) `== 1`, write speech samples to [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out).
10. If the end of speech is detected within [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording) ms, invoke [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end)
    and restart from step 1.
11. If the end of speech is _not_ detected within [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording) ms, invoke [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit)
    and restart from step 1.
12. Continue processing at step 7 until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end).

Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in.
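
The numbered steps above can be read as a two-phase state machine. The sketch below simulates that control flow in plain Python so the event ordering is easier to follow; it is not the TrulyNatural API and makes several assumptions. Audio is modeled as a sequence of booleans from a stand-in speech detector, each frame is assumed to cover `FRAME_MS` milliseconds, the end of speech is assumed to mean `trailing_silence` ms of consecutive non-speech, and the pass-through output of step 9 is left out. The names `run_endpointer`, `FRAME_MS`, and `fire` are hypothetical. Only events that have a registered handler produce a callback.

```python
from typing import Callable, Dict, Iterable

FRAME_MS = 10  # assumed frame duration; illustrative only

Handler = Callable[[], None]


def run_endpointer(frames: Iterable[bool], handlers: Dict[str, Handler],
                   leading_silence: int, trailing_silence: int,
                   max_recording: int) -> None:
    """Walk the endpointing flow over per-frame speech decisions.

    frames: True where a stand-in detector marks the frame as speech.
    handlers: callbacks keyed by event name; unregistered events are skipped.
    Durations are in milliseconds.
    """
    def fire(event: str) -> None:
        handler = handlers.get(event)
        if handler is not None:
            handler()

    recording = False  # False: steps 1-5, True: steps 7-12
    elapsed = 0        # ms since the current phase started
    quiet = 0          # ms of consecutive non-speech while recording

    for is_speech in frames:          # steps 1 and 7: read audio
        fire("^sample-count")         # steps 2 and 8
        elapsed += FRAME_MS
        if not recording:
            if is_speech:             # step 3, then step 6
                fire("^begin")
                recording, elapsed, quiet = True, 0, 0
            elif elapsed >= leading_silence:   # step 4
                fire("^silence")
                elapsed = 0
        else:
            # Step 9 (pass-through output) is omitted from this sketch.
            quiet = 0 if is_speech else quiet + FRAME_MS
            if quiet >= trailing_silence:      # step 10 (assumed end rule)
                fire("^end")
                recording, elapsed = False, 0
            elif elapsed >= max_recording:     # step 11
                fire("^limit")
                recording, elapsed = False, 0


# Register handlers only for the events of interest.
log = []
handlers = {name: (lambda n=name: log.append(n)) for name in ("^begin", "^end")}
run_endpointer([False] * 3 + [True] * 4 + [False] * 3, handlers,
               leading_silence=50, trailing_silence=20, max_recording=1000)
print(log)  # ['^begin', '^end']
```

In this run, 30 ms of silence is followed by 40 ms of speech and 30 ms of silence. Speech starts before the 50 ms `leading_silence` timeout, so `^begin` fires, and 20 ms of trailing non-speech then triggers `^end`. No `^sample-count` callbacks occur because no handler was registered for that event.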

## Settings

**Available events:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event), [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence)

**Available iterators:** _none_

**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)

**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [skip-to-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-ms), [skip-to-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-sample)

**Available configuration settings:** [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence), [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording), [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through), [trailing-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#trailing-silence)

**Available values:** [vad](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#vad)

**Also see these related items:** [live-segment.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-segment.md#live-segment-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code)

<!-- Abbreviation definitions from includes/abbreviations.md -->
*[API]: Application Programming Interface
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector
