The default behavior if `panicFunc` is `NULL` is to print an error message to `stderr` and then abort the process. This is undesirable on small embedded platforms with either no OS, or a very lightweight RTOS, that run from a single memory image.

[config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config) returns [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok) if the SDK supports overriding license keys and the license key format is valid, [LICENSE_OVERRIDE_NOT_VALID](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) if the override key does not exist or did not pass validation, [LICENSE_OVERRIDE_NOT_SUPPORTED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_supported) if the SDK port does not support override keys, and [LICENSE_OVERRIDE_NOT_ENABLED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_enabled) if the existing library license key does not enable the override feature. The returned string value is reference-counted. You must call [release](https://doc.sensory.com/tnl/7.8/api/heap.md#release) on it before it goes out of scope or a memory leak will result. The returned value will be `NULL` if the named field does not exist in the license key.

**snsrConfig() parameters** ```c SNSR_API SnsrRC snsrConfig( SNSR_CONFIG_LICENSE, const char *field, const char *value ); ``` - **Input parameter:** `field`: The name of an override license key field. - **Output parameter:** `value`: The string value for `field`. Reference-counted. Must be [release](https://doc.sensory.com/tnl/7.8/api/heap.md#release)d. **Example** ```c const char *expires; SnsrRC r = snsrConfig(SNSR_CONFIG_LICENSE_INFO, "exp", &expires); ``` **Also see these related items:** [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config), [LICENSE](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_license), [license keys](https://doc.sensory.com/tnl/7.8/reference/overview.md#license-keys) **`LICENSE_SUPPORT`**

Software license key override support. [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config) returns [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok) if the SDK supports overriding license keys, [LICENSE_OVERRIDE_NOT_SUPPORTED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_supported) if the SDK port does not include support for overriding keys, and [LICENSE_OVERRIDE_NOT_ENABLED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_enabled) if the existing library license key does not enable the override feature.

You may retrieve this value even if active library handles exist.

**snsrConfig() parameters**

```c
SNSR_API SnsrRC
snsrConfig(
  SNSR_CONFIG_LICENSE_SUPPORT
);
```

**Also see these related items:** [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config), [LICENSE](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_license), [library-info](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#library-info)

## Heap allocators

You can replace the heap memory allocator (`malloc()`, `realloc()`, `free()`) used by the TrulyNatural SDK with alternate implementations. This is most useful on small platforms where a standard library implementation is either not available, or not recommended, or RAM use must be strictly constrained.

By default this library uses the dynamic memory allocation functions defined in `stdlib.h`.

Built-in allocators include [allocBuddy](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocbuddy), [allocStdlib](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocstdlib), [allocTLSF](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloctlsf) (recommended), and an API for [creating your own](https://doc.sensory.com/tnl/7.8/api/library-config.md#user-defined-allocator).

### allocBuddy

**C/C++**

```c
SNSR_API const SnsrAlloc_Vmt *
snsrAllocBuddy(void *poolStart, size_t poolSizeInBytes);
```

**Parameters and return value:**

**Input parameter:** `poolStart`

* Points to the start of a read-write memory segment to use as the allocator backing store. This address **must** be aligned to the word size of the CPU.

**Input parameter:** `poolSizeInBytes`

* The number of bytes available at `poolStart`.

**Return value:**

* A new custom allocator definition.

Buddy allocator.

This is an implementation of the [Buddy memory allocation][bmem] heap allocation algorithm. The size of returned blocks is the smallest power-of-two in which the requested block size fits.
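The power-of-two rounding rule can be sketched in a few lines of C. This is an illustration only: `buddyBlockSize()` and `buddyWaste()` are hypothetical helpers, not SDK functions, and the sketch ignores any minimum block size or per-block bookkeeping that a real buddy allocator may add.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the SDK): smallest power of two that
 * is >= request. Returns 0 for a zero-byte request or on overflow. */
static size_t buddyBlockSize(size_t request)
{
  size_t block = 1;
  if (request == 0) return 0;
  while (block < request) {
    if (block > (size_t)-1 / 2) return 0;  /* doubling would overflow */
    block <<= 1;
  }
  return block;
}

/* Hypothetical helper: bytes lost to internal fragmentation for one request. */
static size_t buddyWaste(size_t request)
{
  size_t block = buddyBlockSize(request);
  return block ? block - request : 0;
}
```

Under these assumptions `buddyBlockSize(100)` is 128 and `buddyWaste(100)` is 28, while a request that is already a power of two wastes nothing.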
This allocator is fast and has low external fragmentation, but this comes at the expense of significant internal fragmentation (e.g. a 1025-byte request requires allocation of a 2048-byte block, wasting 1023 bytes).

This is a customization of the [memsys5 allocator from SQLite][mem5] that is in the Public Domain.

**Example:**

```c
#define POOL_SIZE 128000
static size_t pool[POOL_SIZE / sizeof(size_t)];

// in main, before any other snsr* calls
snsrConfig(SNSR_CONFIG_ALLOC, snsrAllocBuddy(pool, sizeof(pool)));
```

**Also see these related items:** [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config), [CONFIG_ALLOC](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_alloc)

### allocLock

**C/C++**

```c
SNSR_API const SnsrAlloc_Vmt *
snsrAllocLock(const SnsrAlloc_Vmt *vmt);
```

**Parameters and return value:**

**Input parameter:** `vmt`

* A custom allocator definition.

**Return value:**

* A new custom allocator definition that is thread-safe.

Thread-safe allocator wrapper.

This takes a [heap allocator](https://doc.sensory.com/tnl/7.8/api/library-config.md#heap-allocators) and adds mutual exclusion locks to make it thread-safe. This wrapper has no effect if the TrulyNatural SDK does not have thread support on the target platform: the function adds the lock wrapper only if `snsrConfig(SNSR_CONFIG_THREAD_SUPPORT) == SNSR_RC_OK`.
**Example:**

```c
#define POOL_SIZE 128000
static size_t pool[POOL_SIZE / sizeof(size_t)];

// in main, before any other snsr* calls
snsrConfig(SNSR_CONFIG_ALLOC, snsrAllocLock(snsrAllocTLSF(pool, sizeof(pool))));
```

**Also see these related items:** [allocBuddy](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocbuddy), [allocStdlib](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocstdlib), [allocTLSF](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloctlsf)

### allocPerf

**C/C++**

```c
SNSR_API const SnsrAlloc_Vmt *
snsrAllocPerf(const SnsrAlloc_Vmt *vmt);
```

**Parameters and return value:**

**Input parameter:** `vmt`

* A custom allocator definition.

**Return value:**

* A new custom allocator definition that gathers allocation statistics.

Statistics-gathering allocator wrapper.

Adds instrumentation to an allocator to determine the heap high-water mark, number of allocations, internal fragmentation overhead, etc.

**Warning:** This wrapper adds mutual exclusion locks to the wrapped allocator. Do not use with [allocLock](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloclock), as that will result in deadlock or undefined behavior.
**Example:**

```c
// in main, before any other snsr* calls
snsrConfig(SNSR_CONFIG_ALLOC, snsrAllocPerf(snsrAllocStdlib()));
// application code
snsrTearDown();
snsrAllocPerfStats(snsrStreamFromFILE(stdout, SNSR_ST_MODE_WRITE));
```

**Also see these related items:** [allocBuddy](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocbuddy), [allocStdlib](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocstdlib), [allocTLSF](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloctlsf), [allocPerfStats](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocperfstats)

### allocPerfStats

**C/C++**

```c
SNSR_API size_t
snsrAllocPerfStats(SnsrStream out);
```

**Parameters and return value:**

**Input parameter:** `out`

* A writable [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) to receive allocator statistics, or `NULL` to produce no output.

**Return value:**

* The smallest heap pool size required to repeat the instrumented allocation run.

Show allocator statistics.

Returns the smallest heap pool size that could be sufficient to repeat the instrumented allocation run. To reduce the chance of out-of-heap errors, allocate a pool that is at least 10% larger than this minimum.

If the `out` [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) is not `NULL`, this function writes heap allocator statistics in human-readable form to this stream.

**Note:** Requires use of [allocPerf](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocperf) to gather statistics.

**Also see these related items:** [allocPerf](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocperf)

### allocStdlib

**C/C++**

```c
SNSR_API const SnsrAlloc_Vmt *
snsrAllocStdlib(void);
```

**Parameters and return value:**

**Return value:**

* A new custom allocator definition.

Standard C library allocator.

This is the standard library allocator: `malloc()`, `realloc()`, and `free()`.
It is the default heap allocator used unless overridden with [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config).

**Warning:** This allocator is not thread-safe. If your application runs TrulyNatural SDK code from more than one execution thread you must add mutual exclusion locking by wrapping this allocator with [allocLock](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloclock).

Whether this allocator works with [allocPerf](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocperf) depends on whether the standard C library for the platform has a `malloc_usable_size()` or `malloc_size()` implementation available. If in doubt, use [allocTLSF](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloctlsf) for performance measurement instead.

**Example:**

```c
// in main, before any other snsr* calls
snsrConfig(SNSR_CONFIG_ALLOC, snsrAllocStdlib());
```

**Also see these related items:** [allocLock](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloclock), [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config), [CONFIG_ALLOC](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_alloc)

### allocTLSF - recommended

**C/C++**

```c
SNSR_API const SnsrAlloc_Vmt *
snsrAllocTLSF(void *poolStart, size_t poolSizeInBytes);
```

**Parameters and return value:**

**Input parameter:** `poolStart`

* Points to the start of a read-write memory segment to use as the allocator backing store. This address **must** be aligned to the word size of the CPU.

**Input parameter:** `poolSizeInBytes`

* The number of bytes available at `poolStart`.

**Return value:**

* A new custom allocator definition.

TLSF allocator.

The [Two-level Segregated Fit][tlsf] allocator is recommended for embedded systems. It has `O(1)` cost for most operations, low overhead, low internal fragmentation, and supports multiple backing store pools for the heap.

This is a customization of a [TLSF library][tlsfv3] that is in the Public Domain.
**Warning:** This allocator is not thread-safe. If your application runs TrulyNatural SDK code from more than one execution thread you must add mutual exclusion locking by wrapping this allocator with [allocLock](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloclock).

**Example:**

```c
#define POOL_SIZE 128000
static size_t pool[POOL_SIZE / sizeof(size_t)];

// in main, before any other snsr* calls
snsrConfig(SNSR_CONFIG_ALLOC, snsrAllocTLSF(pool, sizeof(pool)));
```

**Also see these related items:** [allocLock](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloclock), [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config), [CONFIG_ALLOC](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_alloc)

## User-defined allocator

You can create your own custom [heap allocator](https://doc.sensory.com/tnl/7.8/api/library-config.md#heap-allocators) by implementing the functions defined in [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt), and then calling [`snsrConfig(SNSR_CONFIG_ALLOC, &vmt)`](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_alloc).

### alloc_Vmt

**C/C++**

```c
typedef struct {
  void *(*malloc)(void *ctx, size_t size);                             // (1)!
  void (*free)(void *ctx, void *ptr);                                  // (2)!
  void *(*realloc)(void *ctx, void *ptr, size_t size);                 // (3)!
  size_t (*size)(void *ctx, void *ptr);                                // (4)!
  size_t (*roundUp)(void *ctx, size_t size);                           // (5)!
  size_t (*minPoolSize)(void *ctx, size_t maxAlloc, size_t maxCount);  // (6)!
  SnsrAllocRC (*addPool)(void *ctx, void *pool, size_t size);          // (7)!
  SnsrAllocRC (*setUp)(void *ctx);                                     // (8)!
  SnsrAllocRC (*tearDown)(void *ctx);                                  // (9)!
  void *ctx;  // Allocator context (10)
} SnsrAlloc_Vmt;
```

1. _(required)_ **Also see these related items:** [malloc](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-malloc)
2. _(required)_ **Also see these related items:** [free](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-free)
3. _(required)_ **Also see these related items:** [realloc](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-realloc)
4. _(optional)_ **Also see these related items:** [size](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-size)
5. _(optional)_ **Also see these related items:** [roundUp](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-roundUp)
6. _(optional)_ **Also see these related items:** [minPoolSize](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-minPoolSize)
7. _(optional)_ **Also see these related items:** [addPool](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-addPool)
8. _(optional)_ **Also see these related items:** [setUp](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-setUp)
9. _(optional)_ **Also see these related items:** [tearDown](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-tearDown)
10. _(required)_ `ctx` should point to the custom allocator data structure. It is passed to each implementation method as the first argument.

User-defined allocator virtual method table.

This method table defines a heap allocator to use in place of `malloc()`, `realloc()`, and `free()` from `stdlib.h`. The struct must remain valid until a call to [tearDown](https://doc.sensory.com/tnl/7.8/api/heap.md#teardown), or the application ends.

**Example:** This minimal example wraps the standard C library allocator.
```c
#include <stdlib.h>  // malloc(), realloc(), free()

// ctx is unused: the standard library allocator keeps its own state.
static void *
vmtMalloc(void *ctx, size_t size) {
  return malloc(size);
}

static void *
vmtRealloc(void *ctx, void *ptr, size_t size) {
  return realloc(ptr, size);
}

static void
vmtFree(void *ctx, void *ptr) {
  free(ptr);
}

// Member order: malloc, free, realloc, then the optional methods and ctx.
static const SnsrAlloc_Vmt CustomAlloc = {
  vmtMalloc, vmtFree, vmtRealloc,
  NULL, NULL, NULL, NULL, NULL, NULL, NULL
};

// In main, before any other snsr* calls
snsrConfig(SNSR_CONFIG_ALLOC, &CustomAlloc);
```

**Also see these related items:** [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config), [CONFIG_ALLOC](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_alloc), [AllocRC](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocrc), [malloc](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-malloc), [free](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-free), [realloc](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-realloc), [size](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-size), [roundUp](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-roundUp), [minPoolSize](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-minPoolSize), [addPool](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-addPool), [setUp](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-setUp), [tearDown](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-tearDown)

#### malloc - required

**C/C++**

```c
void *(*malloc)(void *ctx, size_t size);
```

**Parameters and return value:**

**Input parameter:** `ctx`

* The value of the `ctx` member in [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt).

**Input parameter:** `size`

* Number of bytes to allocate. `size > 0`.

**Return value:**

* A pointer to an aligned segment of writable memory, or `NULL` if memory could not be allocated.

Allocate memory from the heap.

This method should work like the standard library's `malloc()` function.
It should return a pointer to at least `size` bytes of writable memory, aligned to at least the word size of the CPU.

**Also see these related items:** [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt)

#### free - required

**C/C++**

```c
void (*free)(void *ctx, void *ptr);
```

**Parameters and return value:**

**Input parameter:** `ctx`

* The value of the `ctx` member in [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt).

**Input parameter:** `ptr`

* Pointer to a previously allocated heap segment, as returned by [malloc](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-malloc) or [realloc](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-realloc).

Free memory allocated with the malloc or realloc methods.

This method should work like the standard library's `free()` function.

**Also see these related items:** [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt)

#### realloc - required

**C/C++**

```c
void *(*realloc)(void *ctx, void *ptr, size_t size);
```

**Parameters and return value:**

**Input parameter:** `ctx`

* The value of the `ctx` member in [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt).

**Input parameter:** `ptr`

* Pointer to a previously allocated heap segment, as returned by [malloc](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-malloc) or [realloc](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-realloc).

**Input parameter:** `size`

* Number of bytes to allocate.

**Return value:**

* A pointer to an aligned segment of writable memory, or `NULL` if memory could not be allocated.

Change the size of a heap allocation.

This method should work like the standard library's `realloc()` function. It should return a pointer to at least `size` bytes of writable memory, aligned to at least the word size of the CPU.
**Also see these related items:** [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt) #### size - optional **C/C++** ```c size_t (*size)(void *ctx, void *ptr); ``` **Parameters and return value:** **Input parameter:** `ctx` * The value of the `ctx` member in [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt). **Input parameter:** `ptr` * Pointer to previously allocated heap segment, as returned by [malloc](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-malloc), or [realloc](https://doc.sensory.com/tnl/7.8/api/library-config.md#vmt-realloc). **Return value:** * The size of the memory block `ptr` points to. Return the size of the ptr allocation. This method should work like `malloc_size()` or `malloc_usable_size()`. It should return the actual size of the block of memory allocated for `ptr`. This will typically be a bit larger than the requested size. **Note:** This method is optional and used only by [allocPerf](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocperf). If the allocation size is not available, set this method to `NULL` and avoid using [allocPerf](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocperf). **Also see these related items:** [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt) #### roundUp - optional **C/C++** ```c size_t (*roundUp)(void *ctx, size_t size); ``` **Parameters and return value:** **Input parameter:** `ctx` * The value of the `ctx` member in [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt). **Input parameter:** `size` * Request size in bytes. **Return value:** * The size of the block that would be allocated for request `size`. Return the allocation size for a given request size. This returns the size of the block that would be allocated when requesting a `size` byte segment. **Note:** This method is optional. 
If set to `NULL`, the block size will be the requested `size` rounded up to the next multiple of 8.

**Also see these related items:** [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt)

#### minPoolSize - optional

**C/C++**

```c
size_t (*minPoolSize)(void *ctx, size_t maxAlloc, size_t maxCount);
```

**Parameters and return value:**

**Input parameter:** `ctx`

* The value of the `ctx` member in [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt).

**Input parameter:** `maxAlloc`

* The high-water mark of bytes allocated by the instrumented application.

**Input parameter:** `maxCount`

* The high-water mark of the number of allocations made by the application.

**Return value:**

* An estimate of the smallest backing store pool that could satisfy `maxAlloc` and `maxCount`.

Return an estimate of the pool size required.

This method returns an estimate of the smallest allocator pool required to repeat an instrumented run.

**Note:** This method is optional. If set to `NULL`, the estimated pool size will not be available in [allocPerfStats](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocperfstats).

**Also see these related items:** [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt)

#### addPool - optional

**C/C++**

```c
SnsrAllocRC (*addPool)(void *ctx, void *pool, size_t size);
```

**Parameters and return value:**

**Input parameter:** `ctx`

* The value of the `ctx` member in [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt).

**Input parameter:** `pool`

* Pointer to the start of a read-write memory segment to use as additional allocator store. This address must be aligned to the word size of the CPU.

**Input parameter:** `size`

* The number of bytes available at `pool`.
**Return value:** * [ALLOC_RC_OK](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_rc_ok) upon success, any other [AllocRC](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocrc) on failure. Add a new backing store pool to the allocator. This method adds a new pool to the backing store used for the heap. **Note:** This method is optional. If set to `NULL`, the allocator does not support adding pools to the heap store. **Also see these related items:** [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt), [AllocRC](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocrc) #### setUp - optional **C/C++** ```c SnsrAllocRC (*setUp)(void *ctx); ``` **Parameters and return value:** **Input parameter:** `ctx` * The value of the `ctx` member in [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt). **Return value:** * [ALLOC_RC_OK](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_rc_ok) upon success, any other [AllocRC](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocrc) on failure. Initialize the memory allocator. This method is called before any other to initialize the allocator. **Note:** This method is optional. If set to `NULL`, no additional initialization is done. **Also see these related items:** [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt), [AllocRC](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocrc) #### tearDown - optional **C/C++** ```c SnsrAllocRC (*tearDown)(void *ctx); ``` **Parameters and return value:** **Input parameter:** `ctx` * The value of the `ctx` member in [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt). **Return value:** * [ALLOC_RC_OK](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_rc_ok) upon success, any other [AllocRC](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocrc) on failure. Shut down the memory allocator. 
This method should deallocate any resources this allocator initialized. **Note:** This method is optional. If set to `NULL`, no additional deallocation is done. **Also see these related items:** [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt), [AllocRC](https://doc.sensory.com/tnl/7.8/api/library-config.md#allocrc) ### AllocRC **C/C++** ```c typedef enum { SNSR_ALLOC_RC_{name}, ... } SnsrAllocRC; // Where {name} is from the table below, e.g.: SNSR_ALLOC_RC_OK ``` This is the return code used by the implementation methods in [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt). **Also see these related items:** [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt) **`OK`** Success. **`ERROR`** Unspecified failure. **`ALLOCATOR_EXISTS`** Allocator already configured. **`ALLOC_FAILED`** Out of heap memory. **`NO_FUNC`** Required implementation method is `NULL`. **`NOT_SUPPORTED`** Not supported by this allocator.
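To make the role of `ctx` in the [alloc_Vmt](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloc_vmt) methods concrete, here is a minimal sketch of a bump (arena) allocator whose functions match the `malloc`, `free`, `realloc`, `size`, and `roundUp` signatures. Everything named `Demo*` or `demo*` is hypothetical and not part of the SDK. The sketch assumes an 8-byte header in front of each block, a word-aligned backing store, and `sizeof(size_t) <= 8`; it never reuses freed memory, so it illustrates the interface rather than a production heap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical arena state; the ctx member of the method table points here. */
typedef struct {
  unsigned char *base;  /* start of the backing store (word-aligned) */
  size_t capacity;      /* bytes available at base */
  size_t used;          /* bytes handed out so far, headers included */
} DemoArena;

/* Assumption: every block is preceded by a header holding its rounded size. */
#define DEMO_HEADER 8u

/* roundUp: size of the block that a request for `size` bytes receives. */
static size_t demoRoundUp(void *ctx, size_t size)
{
  (void)ctx;
  return (size + 7u) & ~(size_t)7u;
}

/* size: read the block size back from the header in front of ptr. */
static size_t demoSize(void *ctx, void *ptr)
{
  size_t block;
  (void)ctx;
  memcpy(&block, (unsigned char *)ptr - DEMO_HEADER, sizeof(block));
  return block;
}

/* malloc: carve the next block off the arena, or return NULL when full. */
static void *demoMalloc(void *ctx, size_t size)
{
  DemoArena *a = ctx;
  size_t block = demoRoundUp(ctx, size);
  size_t avail = a->capacity - a->used;
  unsigned char *p;
  if (block < size || avail < DEMO_HEADER || block > avail - DEMO_HEADER)
    return NULL;  /* size overflowed while rounding, or the arena is full */
  p = a->base + a->used;
  memcpy(p, &block, sizeof(block));
  a->used += DEMO_HEADER + block;
  return p + DEMO_HEADER;
}

/* free: a bump allocator cannot reuse individual blocks, so do nothing. */
static void demoFree(void *ctx, void *ptr)
{
  (void)ctx;
  (void)ptr;
}

/* realloc: keep the block if it is large enough, else copy to a new one. */
static void *demoRealloc(void *ctx, void *ptr, size_t size)
{
  size_t old;
  void *fresh;
  if (ptr == NULL) return demoMalloc(ctx, size);
  old = demoSize(ctx, ptr);
  if (size <= old) return ptr;
  fresh = demoMalloc(ctx, size);
  if (fresh != NULL) memcpy(fresh, ptr, old);
  return fresh;
}
```

In a real integration these functions would fill the corresponding slots of a `SnsrAlloc_Vmt`, with its `ctx` member pointing at a `DemoArena` that outlives the library. The remaining optional members would be `NULL`.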
[bmem]: https://en.wikipedia.org/wiki/Buddy_memory_allocation "Buddy memory allocation" [mem5]: https://www.sqlite.org/malloc.html#memsys5 "Zero-malloc memory allocator" [printf()]: https://en.cppreference.com/w/c/io/fprintf "printf() standard C library function" [tlsf]: http://www.gii.upv.es/tlsf/main/docs "Two-level Segregated Fit allocator" [tlsfv3]: https://github.com/OlegHahm/tlsf/tree/27c65c2966dfc6f7055fbeaf63141ce39cc1f16c "TLSF library v3" [undefined-behavior]: https://en.wikipedia.org/wiki/Undefined_behavior "Undefined program behavior" *[API]: Application Programming Interface *[iff]: if, and only if *[RAM]: Random Access Memory *[RTOS]: Real-Time Operating System *[SDK]: Software Development Kit *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "api/overview.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/overview/" --- # API overview This is a brief overview of the [API design goals](https://doc.sensory.com/tnl/7.8/api/overview.md#design-goals), the SDK's [conceptual model](https://doc.sensory.com/tnl/7.8/api/overview.md#conceptual-model), and the two supported [audio processing modes](https://doc.sensory.com/tnl/7.8/api/overview.md#processing-modes). ## Design goals The TrulyNatural SDK API is a result of these design goals: * Pure C implementation. * Lowest common denominator, widest toolchain availability. * No C++ runtime overhead. * Fast. * Simple API. * Small footprint: limited number of functions and data types. * Generic, independent of the inference task. * Fundamental data types only: floating point, integer, strings, streams, and opaque object instance handles. * Make it easier to provide bindings for languages other than C. * Flexible configuration. * Hide complexity, * but still allow for fine-grained configuration if needed. * Settings indexed by string names, documented settings define public API.
* One self-contained model per task. * Model includes a flow graph that specifies how various low-level internal modules (feature extractors, acoustic models, etc.) connect and interact. * Includes all required module configurations. * Run on a wide variety of platforms, including ones without file system support. There is a significant downside to these design choices: Discoverability is very limited. You cannot determine model behavior from function or method names alone. You must refer to the [model type](https://doc.sensory.com/tnl/7.8/models/types/index.md#model-types) documentation for expected task behavior and [available settings](https://doc.sensory.com/tnl/7.8/api/setting-keys/index.md#setting-keys). ## Conceptual model This library uses a [dataflow][] approach to evaluate speech recognition tasks. It uses [inversion of control][]: The SDK invokes [event](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#events) handlers to report [results](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#results) and control task flow. The API contains two primary data types: [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) used for model [inference](https://doc.sensory.com/tnl/7.8/api/inference.md#inference), and a [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) abstraction for [input and output](https://doc.sensory.com/tnl/7.8/api/io.md#input-and-output). Sessions hold the entire state of a model instance, and use streams for all input and output. 
There is, for example, a single [load](https://doc.sensory.com/tnl/7.8/api/inference.md#load) function to load a model into a session, but this supports loading from a [named file](https://doc.sensory.com/tnl/7.8/api/io.md#fromfilename), an open [FILE *](https://doc.sensory.com/tnl/7.8/api/io.md#fromfile) handle, a [memory](https://doc.sensory.com/tnl/7.8/api/io.md#frommemory) segment, from the [code](https://doc.sensory.com/tnl/7.8/api/io.md#fromcode) segment, and from compressed [assets](https://doc.sensory.com/tnl/7.8/api/io.md#fromasset) on Android. Models (`snsr` files) define flow pipelines and session behavior. These contain the serialized content[^1] of the session flow graph, including all binary models and configurations. Think of these as hierarchical key-value databases. Once loaded into a [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session), you can query or change the [setting keys](https://doc.sensory.com/tnl/7.8/api/setting-keys/index.md#setting-keys) with generic [getter](https://doc.sensory.com/tnl/7.8/api/inference.md#getters) and [setter](https://doc.sensory.com/tnl/7.8/api/inference.md#setters) functions. [^1]: Similar in concept to [protocol buffers][], but with streamed unpacking into native data structures in RAM, no need for accessor functions, and additional features such as conversion to [code](https://doc.sensory.com/tnl/7.8/api/inference.md#fm_source) for running from the text segment. ## Processing modes Two modes are available for audio processing: * **Pull** mode, where the [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run) function reads audio from the configured input stream. This blocks on read until new data are available. The [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run) function returns only when the stream runs out of data (for example the end of a file), an [event](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#events) handler tells it to stop, or an error occurs. 
* **Push** mode, where the application repeatedly calls the [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) function with small chunks of the audio data. The [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) function returns once it has processed or buffered these data. The application eventually calls [stop](https://doc.sensory.com/tnl/7.8/api/inference.md#stop) to flush and process any buffered data. Model evaluation typically follows this recipe: **Pull mode** 1. Create a new session instance with [new](https://doc.sensory.com/tnl/7.8/api/inference.md#new). 2. [Load](https://doc.sensory.com/tnl/7.8/api/inference.md#load) a task model into the instance. 3. [Set](https://doc.sensory.com/tnl/7.8/api/inference.md#setters) the [input](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm) source stream. 4. [Register](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) one or more [event](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#events) handlers. 5. Enter the main loop by calling [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run). The library will process the input streams and invoke event handlers at appropriate times. The main loop continues until a terminating condition is reached, such as an event returning an error code. 6. [Release](https://doc.sensory.com/tnl/7.8/api/heap.md#release) the session instance. **Also see these related items:** [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program), [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spotc), [evalUDT.java](https://doc.sensory.com/tnl/7.8/api/sample/java/evalUDT.md#evaludtjava), and [hello_world.py](https://doc.sensory.com/tnl/7.8/api/sample/python/hello_world.md#hello_worldpy). **Push mode** 1. Create a new session instance with [new](https://doc.sensory.com/tnl/7.8/api/inference.md#new). 2. 
[Load](https://doc.sensory.com/tnl/7.8/api/inference.md#load) a task model into the instance. 3. [Register](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) one or more [event](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#events) handlers. 4. Process audio segments by calling [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) repeatedly. This will invoke event handlers before `push` returns. 5. Call [stop](https://doc.sensory.com/tnl/7.8/api/inference.md#stop) once to flush any buffered audio. 6. [Release](https://doc.sensory.com/tnl/7.8/api/heap.md#release) the session instance. **Also see these related items:** [push-audio.c](https://doc.sensory.com/tnl/7.8/api/sample/c/push-audio.md#push-audioc) and [stt_push.py](https://doc.sensory.com/tnl/7.8/api/sample/python/stt_push.md#stt-push-py). ## Language bindings This version of the TrulyNatural SDK supports three language bindings: [C][], [Java][], and [Python][]. C is the native API; Java and Python are generated wrappers with idiomatic naming and error handling on top of the same session and stream model. ### C The C binding exposes the native API directly. Functions use the `snsr` prefix and pass opaque handles explicitly, for example `snsrSetHandler(s, key, callback)`. The C binding uses a *latched-error* model: every function returns [SnsrRC](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), and once a session enters an error state, every subsequent call short-circuits with the same code until [clearRC](https://doc.sensory.com/tnl/7.8/api/inference.md#clearrc) (or [reset](https://doc.sensory.com/tnl/7.8/api/inference.md#reset)) is called. Read the latched code with [rC](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) and the human-readable detail with [errorDetail](https://doc.sensory.com/tnl/7.8/api/inference.md#errordetail). 
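The latched-error model is independent of language, so it can be illustrated with a toy class. The sketch below is written in Java to match the sample programs in this reference. `LatchedDemo`, its nested `RC` enum, and all of its methods are hypothetical stand-ins, not SDK code; only the short-circuit behavior described above is modelled.

```java
/** Toy model of a latched return code. Hypothetical; not part of the SDK. */
class LatchedDemo {
    enum RC { OK, SETTING_NOT_FOUND }

    private RC latched = RC.OK;   // the code every later call will report
    private int applied = 0;      // number of operations that really ran

    /** Runs the operation unless an earlier error is still latched. */
    RC set(String key) {
        if (latched != RC.OK) return latched;   // short-circuit, do no work
        if (key.isEmpty()) {                    // stand-in for any failure
            latched = RC.SETTING_NOT_FOUND;
            return latched;
        }
        applied++;
        return latched;
    }

    RC rC() { return latched; }
    void clearRC() { latched = RC.OK; }
    int applied() { return applied; }

    public static void main(String[] args) {
        LatchedDemo s = new LatchedDemo();
        s.set("a");   // succeeds
        s.set("");    // fails and latches the error
        s.set("b");   // skipped: the error is still latched
        System.out.println(s.rC() + " after " + s.applied() + " applied");
        s.clearRC();
        s.set("b");   // runs again once the latch is cleared
        System.out.println(s.rC() + " after " + s.applied() + " applied");
    }
}
```

Under this model, a sequence of calls can be issued back to back and the return code inspected once at the end, because nothing after the first failure has any effect.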
The C binding uses [reference counting](https://doc.sensory.com/tnl/7.8/api/heap.md#memory-management) on every [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session), [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream), and [Callback](https://doc.sensory.com/tnl/7.8/api/inference.md#callback) handle (and on the C string returned by [getString](https://doc.sensory.com/tnl/7.8/api/inference.md#getters)); manage lifetimes with [retain](https://doc.sensory.com/tnl/7.8/api/heap.md#retain) and [release](https://doc.sensory.com/tnl/7.8/api/heap.md#release).

### Java

A C function `snsrXxx(SnsrSession s, ...)` becomes a Java method `Session.xxx(...)`: the `snsr` prefix and the `SnsrSession` first argument are absorbed into the receiver. For example, `snsrSetHandler(s, key, c)` ↔ `s.setHandler(key, c)`.

The Java binding does not surface latched errors to callers. Each Java method either completes successfully or throws an exception describing the failure; subsequent method calls on the same session start fresh. The methods that perform I/O ([Session.load](https://doc.sensory.com/tnl/7.8/api/inference.md#load), [Session.run](https://doc.sensory.com/tnl/7.8/api/inference.md#run), [Stream.copy](https://doc.sensory.com/tnl/7.8/api/io.md#stream-copy), [Stream.getDelim](https://doc.sensory.com/tnl/7.8/api/io.md#stream-getDelim), [Stream.open](https://doc.sensory.com/tnl/7.8/api/io.md#stream-open), [Stream.read](https://doc.sensory.com/tnl/7.8/api/io.md#stream-read), [Stream.skip](https://doc.sensory.com/tnl/7.8/api/io.md#stream-skip), and [Stream.write](https://doc.sensory.com/tnl/7.8/api/io.md#stream-write)) declare `throws java.io.IOException` and so are checked. All other exceptions are unchecked: subclasses of `java.lang.RuntimeException`, plus `java.lang.OutOfMemoryError` for allocation failures.
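A minimal sketch of what the checked/unchecked split means at a call site. `FakeSession` is a hypothetical stand-in defined here so that the example compiles without the SDK; its method names, the `"model.snsr"` path, and the failure conditions are assumptions made for illustration, and only the exception behavior described above is modelled.

```java
import java.io.IOException;

/** Hypothetical stand-in for an SDK session. Not the real Session class. */
class FakeSession {
    private final boolean modelExists;

    FakeSession(boolean modelExists) { this.modelExists = modelExists; }

    /** Declares IOException, like the I/O methods listed above. */
    FakeSession load(String path) throws IOException {
        if (!modelExists) throw new IOException("cannot open " + path);
        return this;
    }

    /** Declares nothing; a failure surfaces as an unchecked exception. */
    FakeSession setInt(String key, int value) {
        if (key == null || key.isEmpty()) {
            throw new IllegalArgumentException("empty setting key");
        }
        return this;
    }
}

class CheckedVsUnchecked {
    /** Returns a short label describing how the call chain ended. */
    static String attempt(boolean modelExists, String key) {
        try {
            new FakeSession(modelExists).load("model.snsr").setInt(key, 1);
            return "ok";
        } catch (IOException e) {
            // Required by the compiler, because load() is checked.
            return "io: " + e.getMessage();
        } catch (RuntimeException e) {
            // Optional: the code compiles without this handler.
            return "runtime: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(attempt(true, "some-key"));
        System.out.println(attempt(false, "some-key"));
        System.out.println(attempt(true, ""));
    }
}
```

Removing the `catch (IOException e)` clause makes this fail to compile, while removing the `catch (RuntimeException e)` clause does not; that asymmetry is the practical difference between the two groups.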
The mapping from [SnsrRC](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) to Java exception class is part of the binding's contract: | Java exception class | Thrown for [SnsrRC](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) codes such as | Typical cause | | ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | | `java.io.IOException` (checked) | [EOF](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_eof), [STREAM](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end), [NOT_OPEN](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [BUFFER_OVERRUN](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [BUFFER_UNDERRUN](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [DELIM_NOT_FOUND](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [LIBRARY_TOO_OLD](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) | Stream I/O failed or reached end-of-data. Only thrown from the methods that declare `throws IOException`; identical conditions outside those methods raise `RuntimeException`. | | `java.lang.OutOfMemoryError` | [NO_MEMORY](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [NOT_ENOUGH_SPACE](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) | Allocation failed. 
| | `java.lang.IllegalArgumentException` | [INVALID_ARG](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [INVALID_HANDLE](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [INCORRECT_SETTING_TYPE](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_incorrect_setting_type), [SETTING_IS_READ_ONLY](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [FORMAT_NOT_SUPPORTED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [VERSION_MISMATCH](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) | The caller passed a value that is the wrong type, the wrong format, or otherwise unacceptable. | | `java.lang.IndexOutOfBoundsException` | [SETTING_NOT_FOUND](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [SETTING_NOT_AVAILABLE](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [VALUE_NOT_SET](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [ARG_OUT_OF_RANGE](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [NAME_NOT_UNIQUE](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [ITERATION_LIMIT](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) | A lookup by name, index, or port missed; or a numeric / iteration limit was exceeded. | | `java.lang.RuntimeException` | All other non-OK codes — `ERROR`, `NOT_IMPLEMENTED`, `CONFIGURATION_*`, `ELEMENT_*`, `LICENSE_*`, `NO_MODEL`, `NOT_INITIALIZED`, `NOT_SUPPORTED`, `TIMED_OUT`, and so on. | Misconfiguration, license issue, internal API violation, or the catch-all bucket. 
| Callbacks that need to control the run loop without raising an exception may return any of [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok), [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end), [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop), [SKIP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_skip), [REPEAT](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_repeat), or [TIMED_OUT](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_timed_out); any other return value from a callback is translated into an exception by the same mapping. Java methods that have no out-parameters in C return the [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) instance instead, so callers chain freely: ```java s.load(input).setHandler(KEY, listener).run(); ``` When an exception is thrown, the underlying [SnsrRC](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) code remains available on the [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) (or [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream)) for the duration of the `catch` block: call [rC](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) to read it programmatically, or [errorDetail](https://doc.sensory.com/tnl/7.8/api/inference.md#errordetail) for the human-readable message. (The exception's `getMessage()` is set to the same `errorDetail` text.) This is useful when the exception class alone is too coarse — for example, distinguishing a missing setting from an unsupported one when both surface as `IndexOutOfBoundsException`: ```java try { s.set(KEY, value).run(); } catch (IndexOutOfBoundsException e) { if (s.rC() == SnsrRC.SETTING_NOT_FOUND) { // Treat as a config-file typo, fall back to a default. } else { throw e; } } ``` The handler does not need to do anything to "reset" the session — as above, the next method call on the same session starts fresh. The Java binding uses standard garbage collection. 
Explicit `retain` / `release` are not exposed in Java and are not needed; [Session.release()](https://doc.sensory.com/tnl/7.8/api/inference.md#session-release) is provided for callers who want to free native resources promptly without waiting for GC, but is otherwise optional. ### Python The [Python][] binding follows the Java naming recipe with [PEP 8][] snake_case: drop the `snsr`/`Snsr` prefix and the session/stream receiver, then convert camelCase to snake_case (`snsrSetInt` → `set_int`, `snsrGetString` → `get_string`). Class methods use the type name instead of a prefix (`snsrStreamFromAudioFile` → `Stream.from_audio_file`). A few names differ on purpose: `snsrSet(s, "+i+…")` → `apply("+i+…")` (not `set_*`), `snsrStreamFromFileName` → `Stream.from_filename`, and stream status uses properties (`s.rc`, `s.error_detail`) instead of `snsrStreamRC` / `snsrStreamErrorDetail`. Look up signatures on the **Python** tabs throughout this reference (setting keys and enums include Python automatically). The [Python][] binding also avoids a latched-error session state. Any call that would return a non-OK [SnsrRC](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) in C raises `snsr.Error` instead; read the detail string from `Error.message` (there is no [clearRC](https://doc.sensory.com/tnl/7.8/api/inference.md#clearrc) or session [errorDetail](https://doc.sensory.com/tnl/7.8/api/inference.md#errordetail) on the Python surface). [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run) is special: it returns an [RC](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) on success (for example when a handler returns [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop)) and raises `snsr.Error` only for true failures. 
Callbacks may return [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok), [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end), [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop), [SKIP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_skip), [REPEAT](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_repeat), or [TIMED_OUT](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_timed_out); any other code from a handler is raised as `snsr.Error`. Install the wheel from the SDK installer, not PyPI — see [Integrate with your build § Python](https://doc.sensory.com/tnl/7.8/api/build-system.md#build-python). | Python | Thrown for [SnsrRC](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) codes such as | Typical cause | | ------ | ------------------------------------- | --------------- | | `snsr.Error` | All non-OK codes (same set as the Java `RuntimeException` and `IOException` rows above, plus stream and configuration failures) | Any API or handler failure; `Error.message` matches C [errorDetail](https://doc.sensory.com/tnl/7.8/api/inference.md#errordetail). | | Normal return from [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run) | [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok), [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop), [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end), … | Handler stopped the loop or the input stream ended without an error. | **Context managers.** `with snsr.Stream(...) as s:` calls [open](https://doc.sensory.com/tnl/7.8/api/io.md#stream-open) on enter and [close](https://doc.sensory.com/tnl/7.8/api/io.md#stream-close) on exit. `with snsr.Session() as s:` does **not** open the session (construction already initializes it); it only [release](https://doc.sensory.com/tnl/7.8/api/heap.md#release)s on exit. That differs from `with open(...)` on a file and from Java, which has no context-manager support. 
The [Python][] binding does not expose [retain](https://doc.sensory.com/tnl/7.8/api/heap.md#retain) or [release](https://doc.sensory.com/tnl/7.8/api/heap.md#release) to callers. Use `with snsr.Session() as s` or `with snsr.Stream(...) as s` to free native resources promptly; otherwise handles are released when Python finalizes the objects. Session [getters](https://doc.sensory.com/tnl/7.8/api/inference.md#getters) that return a [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) (for example `get_stream` in Python) yield a retained handle (same ownership rules as the C API, documented on [Memory management](https://doc.sensory.com/tnl/7.8/api/heap.md#memory-management)). **Ctrl+C during blocking calls.** Python installs its own `SIGINT` handler, so `^C` while [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run) (or another call) is blocked in native code only takes effect after the call returns — the process appears hung. Before a long-running SDK call, restore the default handler: ```python import signal signal.signal(signal.SIGINT, signal.SIG_DFL) ``` Then `^C` terminates the process immediately. Programs that return [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop) from a result handler still exit normally after a spot without keyboard input. ### Kotlin on Android Kotlin Android apps use the **Java binding** unchanged. The same `com.sensory.speech.snsr` `@aar` artifact, the same [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) and [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) classes, and the same [Listener](https://doc.sensory.com/tnl/7.8/api/inference.md#listener) interface — called from Kotlin source. There is no separate Kotlin SDK; the **Java** tabs on [Inference](https://doc.sensory.com/tnl/7.8/api/inference.md#inference) and [I/O](https://doc.sensory.com/tnl/7.8/api/io.md#input-and-output) are the reference for Kotlin callers too. 
The same notes apply to desktop Kotlin against the JAR coordinates in [Integrate with your build § Java](https://doc.sensory.com/tnl/7.8/api/build-system.md#build-java); only the Android-specific items below depend on the platform.

A few interop points are worth knowing because the call site looks slightly different from Java:

* **Lambdas for [Listener](https://doc.sensory.com/tnl/7.8/api/inference.md#listener).** [Listener](https://doc.sensory.com/tnl/7.8/api/inference.md#listener) is a Java single-abstract-method interface, so Kotlin converts a lambda to it directly:

    ```kotlin
    session.setHandler(Snsr.RESULT_EVENT) { ses, _ ->
        println("Spotted \"${ses.getString(Snsr.RES_TEXT)}\".")
        SnsrRC.STOP
    }
    ```

    Kotlin's own `interface` types would require `fun interface` for the same conversion, but Kotlin → Java SAM interop is unconditional. Listener parameters arrive as Kotlin **platform types** (`SnsrSession!`, `String!`) because the Java SDK has no nullability annotations; treat them as non-null.

* **Checked `IOException`.** The I/O methods listed above declare `throws java.io.IOException`. The Kotlin compiler does **not** enforce Java checked exceptions: these methods still throw `IOException` at runtime, but callers receive no compile-time warning. Wrap [Session.load](https://doc.sensory.com/tnl/7.8/api/inference.md#load), [Session.run](https://doc.sensory.com/tnl/7.8/api/inference.md#run), and [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) I/O in `try`/`catch` (or `runCatching`) even though Kotlin lets you omit it.

* **`release()` is not `close()`.** [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) and [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) expose `release()`; they do not implement `java.lang.AutoCloseable`. The Kotlin `use { }` helper does **not** apply. Call `release()` explicitly, or write an app-side extension function; there is no SDK-shipped Kotlin extension.

* **Threading.** [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run) blocks for the lifetime of the recognition session. Use a **dedicated single thread** (e.g. `newSingleThreadContext("snsr")` or a plain `Thread`) — the same worker-thread pattern as the Java Android samples. Do not run [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run) on `Dispatchers.IO`; that pool is sized for short blocking I/O, not for an open-ended pull loop. For end-to-end Kotlin code, see the **Kotlin** sub-tab in [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program) § The program. **Also see these related items:** [Android examples](https://doc.sensory.com/tnl/7.8/api/sample/android/index.md#android-examples), [Integrate with your build § Android](https://doc.sensory.com/tnl/7.8/api/build-system.md#build-android). [C]: https://en.wikipedia.org/wiki/C_(programming_language) "C programming language" [dataflow]: https://en.wikipedia.org/wiki/Flow-based_programming "Flow-based programming" [inversion of control]: https://en.wikipedia.org/wiki/Inversion_of_control [Java]: https://en.wikipedia.org/wiki/Java_(programming_language) "Java programming language" [PEP 8]: https://peps.python.org/pep-0008/ "Style Guide for Python Code" [protocol buffers]: https://en.wikipedia.org/wiki/Protocol_Buffers [Python]: https://en.wikipedia.org/wiki/Python_(programming_language) "Python programming language" *[API]: Application Programming Interface *[RAM]: Random Access Memory *[SDK]: Software Development Kit *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "api/sample/android/SnsrStreamAudioDeviceAndroid.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/android/SnsrStreamAudioDeviceAndroid/" --- # SnsrStreamAudioDeviceAndroid.java This is the source for the 
[fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) implementation for Android. It provides a [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) adapter for [Android Audio][AudioRecord].

**Also see these related items:** [fromProvider](https://doc.sensory.com/tnl/7.8/api/io.md#fromprovider)

## Code

Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/android/misc/SnsrStreamAudioDeviceAndroid.java_

**SnsrStreamAudioDeviceAndroid.java:**

```java
/* Sensory Confidential
 * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/
 *
 * Android AudioRecord to read-only SnsrStream adapter.
 *------------------------------------------------------------------------------
 */

package com.sensory.speech.snsr;

import java.io.IOException;

import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;
import android.media.MediaRecorder.AudioSource;

import com.sensory.speech.snsr.SnsrStream;

/*
 * Implements the SnsrStream.Provider interface for live audio.
 *
 * Create a new SnsrStream instance with:
 *   SnsrStream a = SnsrStream.fromProvider(new SnsrStreamAudioDeviceAndroid(16000),
 *                                          SnsrStreamMode.READ);
 */
class SnsrStreamAudioDeviceAndroid implements SnsrStream.Provider {
  private static final String TAG = "SnsrStreamAudioDeviceAndroid";
  private static final int CHANNELS = AudioFormat.CHANNEL_IN_MONO;
  private static final int ENCODING = AudioFormat.ENCODING_PCM_16BIT;
  private int mSource = AudioSource.VOICE_RECOGNITION;

  /* Use an audio buffer that is at least this long */
  private static final int MIN_BUFFER_SIZE_MS = 1000;

  private AudioRecord mAudio;
  private int mBufferSize;
  private int mSampleRate;

  /**
   * Static constructor.
   * @param[in] source the recording source, for example AudioSource.VOICE_RECOGNITION.
   * @param[in] sampleRate the sample rate. Use 16000.
   */
  public SnsrStreamAudioDeviceAndroid(int source, int sampleRate) {
    double minBufferSize = (double)sampleRate * MIN_BUFFER_SIZE_MS / 1000;
    mBufferSize = AudioRecord.getMinBufferSize(sampleRate, CHANNELS, ENCODING);
    if (mBufferSize < minBufferSize) {
      mBufferSize = mBufferSize * (int)Math.ceil(minBufferSize / mBufferSize);
    }
    mSampleRate = sampleRate;
    mSource = source;
  }

  /**
   * Static constructor with VOICE_RECOGNITION source.
   * @param[in] sampleRate the sample rate. Use 16000.
   */
  public SnsrStreamAudioDeviceAndroid(int sampleRate) {
    this(AudioSource.VOICE_RECOGNITION, sampleRate);
  }

  @Override
  public long onOpen() throws IOException {
    mAudio = new AudioRecord(mSource, mSampleRate, CHANNELS, ENCODING, mBufferSize);
    if (mAudio == null || mAudio.getState() != AudioRecord.STATE_INITIALIZED) {
      mAudio = null;
      throw new IOException("Could not initialize audio device at " +
                            mSampleRate + " Hz.");
    }
    try {
      mAudio.startRecording();
    } catch (IllegalStateException e) {
      mAudio = null;
      throw new IOException(e.toString());
    }
    return OK;
  }

  @Override
  public long onClose() throws IOException {
    try {
      mAudio.stop();
    } catch (IllegalStateException e) {
      // ignore
    }
    mAudio.release();
    mAudio = null;
    return OK;
  }

  @Override
  public void onRelease() {
    if (mAudio != null) mAudio.release();
    mAudio = null;
  }

  @Override
  public long onRead(byte[] buffer) throws IOException {
    int read = mAudio.read(buffer, 0, buffer.length);
    if (Thread.interrupted()) return INTERRUPTED;
    if (read == AudioRecord.ERROR_BAD_VALUE) return INVALID_ARG;
    else if (read < 0) return ERROR;
    return read;
  }

  @Override
  public long onWrite(byte[] buffer) throws IOException {
    return NOT_IMPLEMENTED;
  }
}
```

[AudioRecord]: https://developer.android.com/reference/android/media/AudioRecord "Android AudioRecord class"

*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

--- source_path: 
"api/sample/android/enroll-trigger.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger/" --- # enroll-trigger This example shows how to enroll a user-defined wake word (UDT, trigger, key word spotter). ## Instructions ### Build - Using [Android Studio][as]: - Open _sample/android/enroll-udt/_ as an existing Android Studio project. - Connect your device, or create an emulator instance The sample records audio at 16 kHz, which is not universally supported in the emulator. - Press the Play button to build and run the app. - Using Gradle on the command line: - Ensure that `java -version` reports version 17 or later. - Open a terminal window and change the working directory to the _sample/android/enroll-udt_ subdirectory of the TrulyNatural SDK installation. - Set the `ANDROID_HOME` environment to point to the Android SDK. For example: ```sh export ANDROID_HOME=$HOME/Library/Android/sdk ``` - Connect your device. - Run `#!sh ./gradlew installDebug` or `#! gradlew.bat installDebug` ### Run 1. Open the `EnrollTrigger` app. 2. Pick an enrollment phrase such as "Hello blue genie." 3. Press "ENROLL" and follow the instructions. - Say the phrase when prompted. - Enrollment will continue until three good recordings have been made. If an enrollment does not pass the quality checks the reason for the failure, along with a suggestion on how to correct it, will be shown. 4. Once enrollment is complete, press "TALK" 5. Say the enrollment phrase. 6. When spotted, the log window will show the beginning and end times of the phrase relative to the start of the audio stream, and the speaker verification, [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score). ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/android/enroll-udt/app/src/main/java/com/sensory/speech/snsr/demo/enrolltrigger/_ ### Enroll.java This class does UDT enrollment. 
**Enroll.java:** ```java /* Sensory Confidential * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/ *------------------------------------------------------------------------------ */ package com.sensory.speech.snsr.demo.enrolltrigger; import android.content.Context; import android.media.MediaRecorder.AudioSource; import android.util.Log; import com.sensory.speech.snsr.Snsr; import com.sensory.speech.snsr.SnsrRC; import com.sensory.speech.snsr.SnsrSession; import com.sensory.speech.snsr.SnsrStream; import java.io.File; import java.io.IOException; import java.util.Locale; import java.util.Random; @SuppressWarnings({"SameParameterValue", "SameReturnValue"}) class Enroll implements SnsrSession.Listener { private static final String TAG = "Enroll"; private static final String ENROLL_VERSION = "~0.8.0 || 1.0.0"; private static final String TARGET = null; // Set to (e.g.) "pc38" to produce embedded output. private static final Boolean SAVE_ENROLLMENT_AUDIO = false; private static final double MIN_SAMPLES = (16000 * 0.2); private static final String EnrollSubDir = "/enroll/"; private static final String[] PromptContext = { "it is me.", "will it rain tomorrow?", "what is Google trading at?" 
}; static File getOutDir(Context context) { return new File(context.getFilesDir(), EnrollSubDir); } private static int mEnroll = 0; private final MainActivity mUi; private final String mModelFile, mOutFile, mTriggerPhrase; private final File enrollDir; private Boolean mShowPrompt = true; private int mContextIndex = 0; private SnsrStream mAudio; // saved for use across event handlers private File getDirectory() { return enrollDir; } private void saveEmbeddedModel(SnsrSession task, String streamKey, String fileName) { SnsrStream out = SnsrStream.fromFileName(fileName, "w"); try { out.copy(task.getStream(streamKey)); } catch (IOException e) { Log.e(TAG, e.toString()); } out.close(); out.release(); Log.i(TAG, "Wrote " + streamKey + " to " + fileName); } private void saveEmbeddedModels(SnsrSession s, String target, String filePrefix) { SnsrSession task = new SnsrSession(); try { task.load(s.getStream(Snsr.MODEL_STREAM)); task.setString(Snsr.EMBEDDED_TARGET, target); saveEmbeddedModel(task, Snsr.EMBEDDED_HEADER_STREAM, getPath(filePrefix + "-sch.h")); saveEmbeddedModel(task, Snsr.EMBEDDED_SEARCH_STREAM, getPath(filePrefix + "-sch.bin")); saveEmbeddedModel(task, Snsr.EMBEDDED_ACMODEL_STREAM, getPath(filePrefix + "-net.bin")); } catch (IOException e) { Log.e(TAG, e.toString()); } task.release(); } private String getPath(String fileName) { return new File(getDirectory(), fileName).getAbsolutePath(); } String getOutPath() { return getPath(mOutFile); } private Thread mRecogThread; public Enroll(MainActivity mainActivity, String triggerPhrase, String modelFile, String outFile) { mUi = mainActivity; mTriggerPhrase = triggerPhrase; mModelFile = modelFile; enrollDir = getOutDir(mainActivity); //noinspection ResultOfMethodCallIgnored enrollDir.mkdirs(); mOutFile = outFile; } public synchronized void start() { if (mRecogThread == null) { Log.d(TAG, "Starting enroll thread."); mRecogThread = new Thread(new Runnable() { @Override public void run() { try { doEnroll(); } catch 
(Exception e) { e.printStackTrace(); } } }); mRecogThread.start(); } } public synchronized void stop() { if (mRecogThread != null && mRecogThread.isAlive()) { Log.d(TAG, "Stopping enroll thread."); mRecogThread.interrupt(); try { mRecogThread.join(); mRecogThread = null; } catch (InterruptedException e) { /* ignore */ } } } private void doEnroll() { // Use the microphone audio source, which typically features automatic gain control. mAudio = SnsrStream.fromAudioDevice(AudioSource.MIC, SnsrStream.DEFAULT_SAMPLE_RATE); // Could also chain these, like so: // SnsrSession session = new SnsrSession().load(mModelFile).require(..).setStream(..).setString(..).setHandler(..) SnsrSession session = new SnsrSession(); try { mUi.log("Loading " + mModelFile); session.load(mModelFile) .require(Snsr.TASK_TYPE, Snsr.ENROLL) .require(Snsr.TASK_VERSION, ENROLL_VERSION); } catch (IOException e) { Log.e(TAG, e.toString()); } session.setStream(Snsr.SOURCE_AUDIO_PCM, mAudio); // the user defined phrase to be enrolled session.setString(Snsr.USER, mTriggerPhrase); // Add in some handlers for important lifecycle events session.setHandler(Snsr.FAIL_EVENT, this); session.setHandler(Snsr.PASS_EVENT, this); session.setHandler(Snsr.PROG_EVENT, this); session.setHandler(Snsr.PAUSE_EVENT, this); session.setHandler(Snsr.RESUME_EVENT, this); session.setHandler(Snsr.DONE_EVENT, this); // You can also define a handler class anonymously inline session.setHandler(Snsr.SAMPLES_EVENT, new SnsrSession.Listener() { @Override public SnsrRC onEvent(SnsrSession snsrSession, String s) { if (mShowPrompt && snsrSession.getDouble(Snsr.RES_SAMPLES) >= MIN_SAMPLES) { promptForPhrase(snsrSession); mShowPrompt = false; } return SnsrRC.OK; } } ); mShowPrompt = true; try { session.run(); // Optional: save enrollment context // session.save(SnsrDataFormat.RUNTIME, xyz); } catch (IOException e) { Log.e(TAG, e.toString()); } // Optional but good practice. finalize() will (eventually) release. 
session.release(); mAudio.release(); } public SnsrRC onEvent(SnsrSession s, String key) { Log.i(TAG, "SNSR Event: " + key); switch (key) { case Snsr.FAIL_EVENT: return onFail(s); case Snsr.PASS_EVENT: return onPass(s); case Snsr.PROG_EVENT: return onProgress(s); case Snsr.PAUSE_EVENT: return onPause(s); case Snsr.RESUME_EVENT: return onResume(s); case Snsr.DONE_EVENT: return onDone(s); default: Log.e(TAG, "Failed to implement handler for: "+key); return SnsrRC.OK; } } private SnsrRC onFail(SnsrSession s) { Log.e(TAG, "FAILED: " + s.getString(Snsr.RES_REASON)); Log.e(TAG, " FIX: " + s.getString(Snsr.RES_GUIDANCE)); mUi.log("FAILED: " + s.getString(Snsr.RES_REASON)); mUi.log(" FIX: " + s.getString(Snsr.RES_GUIDANCE)); /* Save failed enrollment recording for debugging * can get it with ADB */ if (SAVE_ENROLLMENT_AUDIO) { SnsrStream audio = s.getStream(Snsr.AUDIO_STREAM); if (audio != null) { final String path = getPath(String.format(Locale.US, "fail-%02d.wav", mEnroll++)); SnsrStream out = SnsrStream.fromAudioFile(path, "w"); try { out.copy(audio); } catch (IOException e) { Log.e(TAG, e.toString()); } out.release(); } } return SnsrRC.OK; } private SnsrRC onPass(SnsrSession s) { mUi.log("Audio is good."); /* Save good enrollment recording for debugging * Can be retrieved via ADB */ if (SAVE_ENROLLMENT_AUDIO) { SnsrStream audio = s.getStream(Snsr.AUDIO_STREAM); if (audio != null) { final String path = getPath(String.format(Locale.US, "pass-%02d.wav", mEnroll++)); SnsrStream out = SnsrStream.fromAudioFile(path, "w"); try { out.copy(audio); } catch (IOException e) { Log.e(TAG, e.toString()); } out.release(); } } return SnsrRC.OK; } private SnsrRC onProgress(SnsrSession s) { if (Thread.interrupted()) return SnsrRC.INTERRUPTED; double p = s.getDouble(Snsr.RES_PERCENT_DONE); String progressNotice = String.format(Locale.US, "Adapting: %3.0f%% done.", p); if (p >= 100) progressNotice = "Adapting complete!"; mUi.log(progressNotice); return SnsrRC.OK; } 
@SuppressWarnings("UnusedParameters") private SnsrRC onPause(SnsrSession s) { mAudio.close(); mUi.log("Checking enrollment quality."); return SnsrRC.OK; } @SuppressWarnings("UnusedParameters") private SnsrRC onResume(SnsrSession s) { try { if (s.getInt(Snsr.ADD_CONTEXT) == 0) mContextIndex = -1; else mContextIndex = (new Random()).nextInt(PromptContext.length); mShowPrompt = true; mAudio.open(); } catch (IOException e) { Log.e(TAG, "Error resuming audio: " + e); return SnsrRC.STREAM; } Log.d(TAG, "open RC: " + mAudio.rC()); return SnsrRC.OK; } private SnsrRC onDone(SnsrSession s) { final String outPath = getOutPath(); SnsrStream out = SnsrStream.fromFileName(outPath, "w"); try { out.copy(s.getStream(Snsr.MODEL_STREAM)); } catch (IOException e) { Log.e(TAG, e.toString()); } out.close(); //noinspection ConstantConditions if (TARGET != null) saveEmbeddedModels(s, TARGET, "embedded-" + TARGET); mUi.notify(UiState.ENROLLED); return SnsrRC.STOP; } private void promptForPhrase(SnsrSession s) { int targetCount = s.getInt(Snsr.ENROLLMENT_TARGET); int currentCount = s.getInt(Snsr.RES_ENROLLMENT_COUNT) + 1; String prompt = "\nSAY: " + mTriggerPhrase; if (mContextIndex >= 0) prompt += " " + PromptContext[mContextIndex]; prompt += " (" + currentCount + " / " + targetCount + ")"; mUi.log(prompt); } } ``` ### PhraseSpot.java This class runs the enrolled wake word recognizer. **PhraseSpot.java:** ```java /* Sensory Confidential * Copyright (C)2016-2026 Sensory, Inc. 
https://sensory.com/ *------------------------------------------------------------------------------ */ package com.sensory.speech.snsr.demo.enrolltrigger; import android.util.Log; import com.sensory.speech.snsr.Snsr; import com.sensory.speech.snsr.SnsrRC; import com.sensory.speech.snsr.SnsrSession; import com.sensory.speech.snsr.SnsrStream; import java.io.IOException; import java.util.Locale; @SuppressWarnings({"SameParameterValue", "CanBeFinal", "UnusedReturnValue"}) class PhraseSpot implements SnsrSession.Listener { private final String TAG = "PhraseSpot"; private Thread mRecogThread; private String mModelPath; private double mTimeout = 0.0; private int mSampleRate; private double mSamples; private double mSamplesTimeoutBegin; private MainActivity mUi; PhraseSpot(MainActivity mainActivity, String model, double timeout) { mUi = mainActivity; mModelPath = model; mTimeout = timeout; mSamples = mSamplesTimeoutBegin = 0; } public synchronized void start() { if (mRecogThread == null) { Log.d(TAG, "Starting recognition thread."); mRecogThread = new Thread(new Runnable() { @Override public void run() { doPhraseSpot(); } }); mRecogThread.start(); } } public synchronized void stop() { if (mRecogThread != null && mRecogThread.isAlive()) { Log.d(TAG, "Stopping recognition thread."); mRecogThread.interrupt(); try { mRecogThread.join(); mRecogThread = null; } catch (InterruptedException e) { /* ignore */ } } } private SnsrRC doPhraseSpot() { Log.d(TAG, "Loading from " + mModelPath + "\n"); SnsrStream audio = SnsrStream.fromAudioDevice(); // Could also chain these, like so: // SnsrSession session = new SnsrSession().load(mModelPath).require(..).setStream(..).setHandler(..) 
SnsrSession session = new SnsrSession(); try { session.load(mModelPath); session.require(Snsr.TASK_TYPE, Snsr.PHRASESPOT); session.setStream(Snsr.SOURCE_AUDIO_PCM, audio); session.setHandler(Snsr.RESULT_EVENT, this); // In case timeout set mSampleRate = session.getInt(Snsr.SAMPLE_RATE); session.setHandler(Snsr.SAMPLES_EVENT, this); session.run(); } catch (IOException e) { /* ignore */ } this.onEvent(session, "stopped"); SnsrRC rc = session.rC(); // Release the underlying C handles immediately, rather than waiting for GC. session.release(); audio.release(); return rc; } @Override public SnsrRC onEvent(SnsrSession s, String key) { if (!Snsr.SAMPLES_EVENT.equals(key)) Log.i(TAG, "SNSR Event: " + key); switch (key) { case Snsr.SAMPLES_EVENT: if (mTimeout == 0) return SnsrRC.OK; mSamples = s.getDouble(Snsr.RES_SAMPLES); double elapsedSamples = mSamples - mSamplesTimeoutBegin; if (elapsedSamples > mTimeout * mSampleRate) { mUi.log("Phrase spot timed out."); return SnsrRC.TIMED_OUT; } else return SnsrRC.OK; case Snsr.RESULT_EVENT: // Start timeout all over again when we hear the trigger mSamplesTimeoutBegin = mSamples; mUi.log(String.format(Locale.US, "\"%s\", score: %.3f", s.getString(Snsr.RES_TEXT), s.getDouble(Snsr.RES_SV_SCORE))); // Try changing this to Snsr.PHONE_LIST s.forEach(Snsr.WORD_LIST, new SnsrSession.Listener() { @Override public SnsrRC onEvent(SnsrSession s, String key) { mUi.log(String.format(Locale.US, " [%4.0f, %4.0f] %s\n", s.getDouble(Snsr.RES_BEGIN_MS), s.getDouble(Snsr.RES_END_MS), s.getString(Snsr.RES_TEXT))); return SnsrRC.OK; } }); return SnsrRC.OK; case "stopped": mUi.notify(UiState.BEFORE_ENROLL); return SnsrRC.OK; default: Log.e(TAG, "Failed to implement handler for: "+key); return SnsrRC.OK; } } } ``` [as]: https://developer.android.com/studio/index.html "Android Studio" *[API]: Application Programming Interface *[SDK]: Software Development Kit *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[UDT]: User-Defined 
Trigger: enrolled wake words and command sets

---
source_path: "api/sample/android/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/android/"
---

# Android examples

The Android sample programs and code snippets are available in _sample/android/_ in the TrulyNatural installation directory. See _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/android/_.

The samples are written in Java; the same `@aar` and APIs work unchanged from [Kotlin][] (see [API overview § Kotlin on Android](https://doc.sensory.com/tnl/7.8/api/overview.md#kotlin-on-android) for interop notes).

New to the Session API? Start with [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program) for the Android wake-word flow, then explore the samples below.

## Examples

[enroll-trigger](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#enroll-trigger) - UDT enrollment and phrase spotting.

[snsr-debug](https://doc.sensory.com/tnl/7.8/api/sample/android/snsr-debug.md#snsr-debug) - Logs recognizer audio and event timing information using the [tpl-spot-debug](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-debug) template.

[SnsrStreamAudioDeviceAndroid.java](https://doc.sensory.com/tnl/7.8/api/sample/android/SnsrStreamAudioDeviceAndroid.md#snsrstreamaudiodeviceandroidjava) - Source for the [fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) implementation for [Android Audio][AudioRecord].

[AudioRecord]: https://developer.android.com/reference/android/media/AudioRecord "Android AudioRecord class"
[Kotlin]: https://kotlinlang.org/ "Kotlin programming language"

*[API]: Application Programming Interface
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[UDT]: User-Defined Trigger: enrolled wake words and command sets

---
source_path: "api/sample/android/snsr-debug.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/android/snsr-debug/"
---

# snsr-debug

This sample shows how to log recognizer audio and event timing debug information using the [tpl-spot-debug](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-debug) template.

## Instructions

### Build

- Using [Android Studio][as]
    - Open _sample/android/snsr-debug/_ as an existing Android Studio project.
    - Connect your device, or create an emulator instance. The sample records audio at 16 kHz, which is not universally supported in the emulator.
    - Press the Play button to build and run the app.
- Using Gradle on the command line:
    - Ensure that `java -version` reports version 17 or later.
    - Open a terminal window and change the working directory to the _sample/android/snsr-debug_ subdirectory of the TrulyNatural SDK installation.
    - Set the `ANDROID_HOME` environment variable to point to the Android SDK. For example:

        ```sh
        export ANDROID_HOME=$HOME/Library/Android/sdk
        ```

    - Connect your device.
    - Run `#!sh ./gradlew installDebug` or `#! gradlew.bat installDebug`

### Run

1. Run the app on your device
    - Open the `SnsrDebug` app.
    - Select one of the Recognition options:
        - Wakeword
        - Wakeword+Commands
        - _(STT only)_ Speech-To-Text
        - _(STT only)_ Wakeword+Speech-To-Text
    - Check "Enable Debugging".
    - Press "TALK", follow instructions.
    - Press "STOP" when you're done.
2. Copy the `snsrlog` files from the device to the host.

    **macOS and Linux**

    ```sh
    adb -d shell "run-as com.sensory.speech.snsr.demo.snsrdebug \
      tar -C /data/user/0/com.sensory.speech.snsr.demo.snsrdebug/files -cf - logs" | tar xvf -
    ```

    **Windows**

    ```sh
    adb -d shell "run-as com.sensory.speech.snsr.demo.snsrdebug \
      tar -C /data/user/0/com.sensory.speech.snsr.demo.snsrdebug/files -cf - logs" > logs.tar
    ```

    You can use [7-zip][] to extract the `tar` archive:

    ```sh
    7za -y -ttar x logs.tar
    ```

3. Extract text, audio and the spotter model with [snsr-log-split](https://doc.sensory.com/tnl/7.8/tools/snsr-log-split.md#snsr-log-split). The number embedded in each `snsrlog` filename is the time when the data capture started, in seconds since the [epoch][].

    ```sh
    snsr-log-split -v logs/SnsrDebug-*.snsrlog
    ```

4. Check audio quality with [audio-check](https://doc.sensory.com/tnl/7.8/tools/audio-check.md#audio-check) on each of the extracted recordings.

    ```sh
    # Check the app for the file basename.
    audio-check -v SnsrDebug-1757808596.wav
    ```

5. To delete old data logs from your device:

    ```sh
    adb -d shell "run-as com.sensory.speech.snsr.demo.snsrdebug \
      rm -rf /data/user/0/com.sensory.speech.snsr.demo.snsrdebug/files"
    ```

## Code

Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/android/snsr-debug/app/src/main/java/com/sensory/speech/snsr/demo/snsrdebug/_

### PhraseSpot.java

This class runs the selected recognizer (wake word, wake word followed by a command set, STT, or wake word followed by STT) and optionally captures audio and event timing information.

The [audio processing mode](https://doc.sensory.com/tnl/7.8/api/overview.md#processing-modes) defaults to _push_. Change this to _pull mode_ by modifying the `#!java RUNMODE` variable:

```java
private final RunMode RUNMODE = RunMode.PULL; // set to PUSH or PULL
```

**PhraseSpot.java:**

```java
/* Sensory Confidential
 * Copyright (C)2016-2026 Sensory, Inc.
https://sensory.com/ *------------------------------------------------------------------------------ */ package com.sensory.speech.snsr.demo.snsrdebug; import android.os.Handler; import android.os.HandlerThread; import android.os.Message; import android.util.Log; import com.sensory.speech.snsr.Snsr; import com.sensory.speech.snsr.SnsrDataFormat; import com.sensory.speech.snsr.SnsrRC; import com.sensory.speech.snsr.SnsrSession; import com.sensory.speech.snsr.SnsrStream; import java.io.File; import java.io.IOException; import java.util.Locale; @SuppressWarnings({"SameParameterValue", "CanBeFinal", "UnusedReturnValue"}) class PhraseSpot implements SnsrSession.Listener { private final String TAG = "PhraseSpot"; private final Boolean VERBOSE = false; // set to true for additional event callbacks private final RunMode RUNMODE = RunMode.PUSH; // set to PUSH or PULL private final int BLOCKSIZE = 480; // size in bytes of a 15 mS audio block captured at 16 KHz // add for push mode HandlerThread private static final int MSG_RESET = 1; private static final int MSG_PUSH = 2; private static final int MSG_STOP = 3; private Thread mRecogThread; private Handler mPushHandler; private final RecogMode mRecogMode; private String mLogPath; private final double mTimeout; private int mSampleRate; private double mSamples; private double mSamplesTimeoutBegin; private final MainActivity mUi; private boolean mDebugging; private volatile boolean mRunning = false, mStopping = false; private HandlerThread mPushHandlerThread; PhraseSpot(MainActivity mainActivity, RecogMode recogMode, double timeout) { mUi = mainActivity; mRecogMode = recogMode; mTimeout = timeout; mSamples = mSamplesTimeoutBegin = 0; mLogPath = null; mDebugging = false; } public void enableDebugging(String logPath) { mDebugging = true; mLogPath = logPath; } public synchronized void start() { if (mRecogThread == null) { mRunning = true; Log.d(TAG, "Starting recognition thread."); mRecogThread = new Thread(new Runnable() { 
@Override public void run() { doPhraseSpot(); } }); mRecogThread.start(); } } public synchronized void stop() { if (mRecogThread != null && mRecogThread.isAlive()) { Log.d(TAG, "Stopping recognition thread."); mRunning = false; try { mRecogThread.join(); mRecogThread = null; } catch (InterruptedException ignored) {} } } private SnsrRC doPhraseSpot() { SnsrStream audio = SnsrStream.fromAudioDevice(); SnsrSession session = new SnsrSession(); String prompt; try { if (mRecogMode == RecogMode.WW_ONLY) { session.load(assetToString(BuildConfig.TRIGGER_MODEL)); prompt = "Say 'Voice Genie'"; } else if (mRecogMode == RecogMode.WW_CMDS) { session.load(assetToString(BuildConfig.SEQUENTIAL_TEMPLATE)); session.setStream(Snsr.SLOT_0, SnsrStream.fromFileName(assetToString(BuildConfig.TRIGGER_MODEL),"r")); session.setStream(Snsr.SLOT_1, SnsrStream.fromFileName(assetToString(BuildConfig.COMMAND_MODEL),"r")); prompt = "Say 'Voice Genie' followed by one of: \n'Play music'\n'Pause music'\n'Stop music'\n'Next song'\n'Previous song'"; } else if (BuildConfig.SDK_TYPE.equals("tnl-stt") && ((mRecogMode == RecogMode.STT_ONLY || mRecogMode == RecogMode.WW_STT))) { session.load(assetToString(BuildConfig.OPT_SPOT_VAD_LVCSR_TEMPLATE)); session.setStream(Snsr.PHRASESPOT, SnsrStream.fromFileName(assetToString(BuildConfig.TRIGGER_MODEL),"r")); session.setStream(Snsr.LVCSR, SnsrStream.fromFileName(assetToString(BuildConfig.STT_MODEL),"r")); session.setInt(Snsr.INCLUDE_LEADING_SILENCE, 1); if (mRecogMode == RecogMode.WW_STT) { session.setString(Snsr.SLOT, Snsr.SLOT_0); prompt = "Say 'Voice Genie' followed by an automotive command (eg 'Turn on the radio' or 'open the rear hatch')"; } else { session.setString(Snsr.SLOT, Snsr.SLOT_1); prompt = "Say an automotive command (eg 'set the AC to 72' or 'roll down the driver's window')"; } } else { throw new Exception("Unknown recognition mode"); } if (mDebugging) { // Create debug session SnsrSession debug = new SnsrSession(); 
        debug.load(assetToString(BuildConfig.DEBUG_TEMPLATE));
        debug.setString(Snsr.DEBUG_LOG_FILE, mLogPath);
        debug.setInt(Snsr.INCLUDE_MODEL, 0);
        // Load existing session into the debug model
        SnsrStream modelData = SnsrStream.fromBuffer(1<<20, 1<<30);
        session.save(SnsrDataFormat.CONFIG, modelData);
        session.release();
        debug.setStream(Snsr.SLOT_0, modelData);
        modelData.release();
        // Replace session with the same model wrapped in the tpl-spot-debug template
        session = debug;
        // Show the debug log file name in the UI
        File file = new File(mLogPath);
        mUi.logToConsole("\nAudio will be logged to " + file.getName());
      }
      // Main result handler
      session.setHandler(Snsr.RESULT_EVENT, this);
      // Get sample rate in case timeout was set
      mSampleRate = session.getInt(Snsr.SAMPLE_RATE);
      session.setHandler(Snsr.SAMPLES_EVENT, this);
      // These events exist only for a subset of the models, we therefore
      // ignore any errors while attempting to set them.
      try {
        if (mDebugging) {
          session.setHandler(Snsr.SLOT_0 + Snsr.SLOT_0 + Snsr.RESULT_EVENT, this);
        } else {
          session.setHandler(Snsr.SLOT_0 + Snsr.RESULT_EVENT, this);
        }
      } catch (Exception ignored) {}
      try { session.setHandler(Snsr.NLU_INTENT_EVENT, this); } catch (Exception ignored) {}
      if (VERBOSE) {
        try { session.setHandler(Snsr.LISTEN_BEGIN_EVENT, this); } catch (Exception ignored) {}
        try { session.setHandler(Snsr.LISTEN_END_EVENT, this); } catch (Exception ignored) {}
        try { session.setHandler(Snsr.BEGIN_EVENT, this); } catch (Exception ignored) {}
        try { session.setHandler(Snsr.END_EVENT, this); } catch (Exception ignored) {}
      }
      session.reset(); // clear error codes reported by rC()
      mUi.logToConsole("\n" + prompt +"\n");
      if (RUNMODE == RunMode.PULL) {
        // pull mode - the session will read audio and process it internally
        session.setStream(Snsr.SOURCE_AUDIO_PCM, audio);
        session.run();
      } else {
        // push mode - the application code reads the audio and passes it to the session
        startHandlerThread(session);
        do {
          byte[] buffer;
          buffer = new
byte[BLOCKSIZE]; long bytesRead = audio.read(buffer); // since audio.read blocks execution while it waits for an audio block to // become available, move session.push into its own Handler mPushHandler.sendMessage(Message.obtain(null, MSG_PUSH, buffer)); } while (session.rC() == SnsrRC.OK && audio.rC() == SnsrRC.OK); mPushHandler.sendMessage(Message.obtain(null, MSG_STOP)); try { mPushHandlerThread.join(); mPushHandlerThread = null; } catch (InterruptedException ignored) {} mPushHandler = null; } } catch (IOException e) { Log.e(TAG, "Error loading and starting model", e); mUi.logToConsole("ERROR: " + e.getMessage()); } catch (Exception e) { Log.e(TAG, "Initialization error" + e); mUi.logToConsole("ERROR: " + e.getMessage()); } this.onEvent(session, "stopped"); SnsrRC rc = session.rC(); // Release the underlying native handles immediately, rather than waiting for GC. session.release(); audio.release(); return rc; } // Format a BuildConfig model filename to an "assets/models" string private String assetToString(String assetName) { return new File("assets/models", assetName.replace(':', '-')).toString(); } // in push mode, run the SnsrSession in its own HandlerThread. 
private void startHandlerThread(SnsrSession session) { mPushHandlerThread = new HandlerThread("pushMode"); mPushHandlerThread.start(); mPushHandler = new Handler(mPushHandlerThread.getLooper()) { @Override public void handleMessage(Message msg) { switch (msg.what) { case MSG_PUSH: byte[] data = (byte[]) msg.obj; //Log.d(TAG, "stt.push called with " + data.length + " bytes"); session.push(Snsr.SOURCE_AUDIO_PCM, data); break; case MSG_STOP: session.stop(); mPushHandlerThread.quit(); break; } } }; } @Override public SnsrRC onEvent(SnsrSession s, String key) { if (!Snsr.SAMPLES_EVENT.equals(key)) Log.i(TAG, "SNSR Event: " + key); switch (key) { case Snsr.SAMPLES_EVENT: // used to implement timeout after 30 seconds of no speech if (!mRunning) return SnsrRC.STOP; if (mTimeout == 0) return SnsrRC.OK; mSamples = s.getDouble(Snsr.RES_SAMPLES); double elapsedSamples = mSamples - mSamplesTimeoutBegin; // Log.d(TAG, "elapsedSamples = " + elapsedSamples); if (elapsedSamples > mTimeout * mSampleRate) { if (!mStopping) mUi.logToConsole("Phrase spot timed out.\n"); mStopping = true; return SnsrRC.TIMED_OUT; } else return SnsrRC.OK; case Snsr.SLOT_0 + Snsr.RESULT_EVENT: // callback for a wakeword result in WW_CMDS mode mUi.logToConsole(String.format(Locale.US, "Wakeword: '%s'", s.getString(Snsr.SLOT_0 + Snsr.RES_TEXT))); return SnsrRC.OK; case Snsr.SLOT_0 + Snsr.SLOT_0 + Snsr.RESULT_EVENT: // callback for a wakeword result in WW_STT mode mUi.logToConsole(String.format(Locale.US, "Wakeword: '%s'", s.getString(Snsr.SLOT_0 + Snsr.SLOT_0 + Snsr.RES_TEXT))); return SnsrRC.OK; case Snsr.RESULT_EVENT: // Reset timeout counter after a result mSamplesTimeoutBegin = mSamples; mUi.logToConsole(String.format(Locale.US, "Result: '%s'\n", s.getString(Snsr.RES_TEXT))); if (VERBOSE) { // print individual words in the result // Try changing this to Snsr.PHONE_LIST for phonemes s.forEach(Snsr.WORD_LIST, new SnsrSession.Listener() { @Override public SnsrRC onEvent(SnsrSession s, String key) { 
mUi.logToConsole(String.format(Locale.US, " [%4.0f, %4.0f] %s", s.getDouble(Snsr.RES_BEGIN_MS), s.getDouble(Snsr.RES_END_MS), s.getString(Snsr.RES_TEXT))); return SnsrRC.OK; } }); } return SnsrRC.OK; case Snsr.LISTEN_BEGIN_EVENT: case Snsr.LISTEN_END_EVENT: case Snsr.BEGIN_EVENT: case Snsr.END_EVENT: // misc sequential and VAD events (VAD requires TrulyNatural SDK) mUi.logToConsole(String.format(Locale.US, "Event: '%s'", key)); return SnsrRC.OK; case Snsr.NLU_INTENT_EVENT: // NLU intents for STT_ONLY and WW_STT recogModes (Requires TrulyNatural SDK) mUi.logToConsole(String.format(Locale.US, "Intent: '%s' = '%s'", s.getString(Snsr.RES_NLU_INTENT_NAME), s.getString(Snsr.RES_NLU_INTENT_VALUE))); s.forEach(Snsr.NLU_ENTITY_LIST, new SnsrSession.Listener() { @Override public SnsrRC onEvent(SnsrSession s, String key) { mUi.logToConsole(String.format(Locale.US, "Entity: '%s' = '%s'", s.getString(Snsr.RES_NLU_ENTITY_NAME), s.getString(Snsr.RES_NLU_ENTITY_VALUE))); return SnsrRC.OK; } }); return SnsrRC.OK; case "stopped": // custom event callback to reset screen buttons and checkboxes mUi.notify(UiState.NOT_TALKING); return SnsrRC.OK; default: Log.e(TAG, "Failed to implement handler for: " + key); return SnsrRC.OK; } } } ``` [7-zip]: http://www.7-zip.org/ [as]: https://developer.android.com/studio/index.html "Android Studio" [epoch]: https://en.wikipedia.org/wiki/Unix_time "Unix time" *[API]: Application Programming Interface *[SDK]: Software Development Kit *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "api/sample/c/alsa-stream.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/alsa-stream/" --- # alsa-stream.c This is the source for the [fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) implementation for [ALSA][], used for live audio capture on Linux. 
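
The `streamFromALSA()` function in the listing on this page picks a period size from the requested latency, multiplies it by `MIN_PERIOD_COUNT`, and then scales the result up if it is shorter than `MIN_BUFFER_MS`. The sketch below isolates that arithmetic so it can be checked without ALSA. The helper name `bufferFrames` is hypothetical and not part of the SDK, and the real code passes the result to `snd_pcm_hw_params_set_buffer_size_near()`, so the driver may still choose a different size.

```c
#include <assert.h>

#define MIN_PERIOD_COUNT 5   /* minimum number of periods in the buffer */
#define MIN_BUFFER_MS    500 /* minimum buffer duration, in milliseconds */

/* Hypothetical helper (not SDK API): returns the ring-buffer size, in frames,
 * that would be requested for a given period size (frames) and rate (Hz).
 * Mirrors the rounding used in the listing: the multiplier is rounded to the
 * nearest integer, so the result can land slightly below the minimum. */
unsigned long bufferFrames(unsigned long periodFrames, unsigned int rate)
{
  unsigned long frames;
  double minFrames;

  assert(periodFrames > 0 && rate > 0);
  frames = MIN_PERIOD_COUNT * periodFrames;
  minFrames = MIN_BUFFER_MS * rate / 1000.0;
  if (frames < minFrames)
    frames *= (unsigned long)(minFrames / frames + 0.5);
  return frames;
}
```

With the low-latency period of 240 frames at 16 kHz, five periods give 1200 frames, which is below the 8000-frame minimum, so the multiplier rounds to 7 and the request becomes 8400 frames (525 ms). The high-latency period of 3200 frames already yields 16000 frames (1 s) and is left unchanged.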
## Instructions **Also see these related items:** See [live-spot-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot-stream.md#live-spot-streamc). ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/alsa-stream.{c,h}_ **alsa-stream.h:** ```c /* Sensory Confidential * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/ * * TrulyHandsfree SDK custom stream header. See alsa-stream.c. *------------------------------------------------------------------------------ */ typedef enum { STREAM_LATENCY_LOW, /* low latency, high CPU overhead */ STREAM_LATENCY_HIGH, /* higher latency, with lower CPU overhead */ } StreamLatency; SnsrStream streamFromALSA(const char *name, unsigned int rate, SnsrStreamMode mode, StreamLatency latency); ``` { data-search-exclude } **alsa-stream.c:** ```c /* Sensory Confidential * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/ * * TrulyHandsfree SDK keyword spotting minimal example using a custom stream. *------------------------------------------------------------------------------ * SnsrStream ALSA (Linux audio) provider implementation. * Currently capture-only. *------------------------------------------------------------------------------ */ #include #include #include #include #include #include #include #include "alsa-stream.h" /* 15 ms at 16 kHz */ #define PERIOD_SIZE_LOW_LATENCY 240 /* 200 ms at 16 kHz */ #define PERIOD_SIZE_HIGH_LATENCY 3200 /* Minimum number of periods the buffer should include */ #define MIN_PERIOD_COUNT 5 /* Buffer size in ms */ #define MIN_BUFFER_MS 500 typedef struct { snd_pcm_t *in; const char *initErrorMsg; /* NULL if initialization was successful */ } ProviderData; /* This wrapper macro is used to simplify ALSA library error checking. * Commands are not executed if an error condition exists. 
*/ #define AE(cmd)\ if (snsrStreamRC(b) == SNSR_RC_OK) {\ int r = snd_pcm_ ## cmd;\ if (r < 0) {\ snsrStream_setDetail(b, "ALSA error: %s", snd_strerror(r));\ snsrStream_setRC(b, SNSR_RC_ERROR);\ }\ } static SnsrRC streamOpen(SnsrStream b) { ProviderData *d = (ProviderData *)snsrStream_getData(b); if (!d->in) { if (d->initErrorMsg) snsrStream_setDetail(b, "%s", d->initErrorMsg); else snsrStream_setDetail(b, "Could not open ALSA device for capture."); return SNSR_RC_NOT_FOUND; } AE( prepare(d->in) ); return snsrStreamRC(b); } static SnsrRC streamClose(SnsrStream b) { ProviderData *d = (ProviderData *)snsrStream_getData(b); AE( drop(d->in) ); return snsrStreamRC(b); } static void streamRelease(SnsrStream b) { ProviderData *d = (ProviderData *)snsrStream_getData(b); AE( close(d->in) ); free((void *)d->initErrorMsg); free(d); } static size_t streamRead(SnsrStream b, void *buffer, size_t size) { ProviderData *d = (ProviderData *)snsrStream_getData(b); snd_pcm_uframes_t read, total = 0, want = size / sizeof(short); short *sbuff = buffer; if (snd_pcm_state(d->in) == SND_PCM_STATE_XRUN) { snsrStream_setRC(b, SNSR_RC_BUFFER_OVERRUN); return 0; } do { read = snd_pcm_readi(d->in, sbuff + total, want - total); if ((int)read < 0) read = snd_pcm_recover(d->in, read, 0); if ((int)read < 0) { snsrStream_setDetail(b, "ALSA read error: %s", snd_strerror((int)read)); snsrStream_setRC(b, SNSR_RC_ERROR); return 0; } total += read; } while (total < want); return total * sizeof(short); } static SnsrStream_Vmt ProviderDef = { "ALSA", &streamOpen, &streamClose, &streamRelease, &streamRead, NULL }; SnsrStream streamFromALSA(const char *name, unsigned int rate, SnsrStreamMode mode, StreamLatency latency) { SnsrStream b; ProviderData *d = (ProviderData *)malloc(sizeof(*d)); snd_pcm_t *h = NULL; snd_pcm_hw_params_t *p = NULL; int dir = 0; snd_pcm_uframes_t frames; if (!d) return NULL; memset(d, 0, sizeof(*d)); b = snsrStream_alloc(&ProviderDef, d, 1, 0); if (!b) { free(d); return NULL; } if 
(mode != SNSR_ST_MODE_READ) { snsrStream_setRC(b, SNSR_RC_INVALID_MODE); return b; } AE( open(&h, name, SND_PCM_STREAM_CAPTURE, 0) ); AE( hw_params_malloc(&p) ); AE( hw_params_any(h, p) ); AE( hw_params_set_access(h, p, SND_PCM_ACCESS_RW_INTERLEAVED) ); AE( hw_params_set_format(h, p, SND_PCM_FORMAT_S16_LE) ); AE( hw_params_set_channels(h, p, 1) ); AE( hw_params_set_rate(h, p, (unsigned)rate, 0) ); switch (latency) { case STREAM_LATENCY_LOW: frames = PERIOD_SIZE_LOW_LATENCY; break; case STREAM_LATENCY_HIGH: frames = PERIOD_SIZE_HIGH_LATENCY; break; } AE( hw_params_set_period_size_near(h, p, &frames, &dir) ); AE( hw_params_get_period_size(p, &frames, &dir) ); frames = MIN_PERIOD_COUNT * frames; if (frames < MIN_BUFFER_MS * rate / 1000.0 ) frames *= (int)(MIN_BUFFER_MS * rate / 1000.0 / frames + 0.5); AE( hw_params_set_buffer_size_near(h, p, &frames) ); AE( hw_params(h, p) ); snd_pcm_hw_params_free(p); if (snsrStreamRC(b) == SNSR_RC_OK) d->in = h; else if (h) snd_pcm_close(h); if (!d->in) d->initErrorMsg = strdup(snsrStreamErrorDetail(b)); return b; } ``` [ALSA]: http://www.alsa-project.org/main/index.php/ALSA_Library_API "Advanced Linux Sound Architecture" *[ALSA]: Advanced Linux Sound Architecture *[API]: Application Programming Interface *[SDK]: Software Development Kit *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "api/sample/c/aqs-stream.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/aqs-stream/" --- # aqs-stream.c This is the source for the [fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) implementation for [Audio Queue Services][], used for live audio capture on macOS and iOS. ## Instructions See [live-spot-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot-stream.md#live-spot-streamc). 
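
The `audioInit()` function in the aqs-stream.c listing derives the size and number of Audio Queue buffers from the latency period and the total buffering time. As a rough illustration, the sketch below reproduces that calculation in plain C with no Audio Toolbox dependency. The `QueueLayout` type and `queueLayout` function are hypothetical names introduced here, and the round-up is written by hand instead of calling `ceil()` so the snippet does not need the math library.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_BUFFER_SIZE  320    /* bytes, 10 ms at 16 kHz */
#define MAX_BUFFER_SIZE  160000 /* bytes, 5 s at 16 kHz */
#define MIN_BUFFER_COUNT 4

/* Hypothetical result type (not SDK API). */
typedef struct {
  unsigned long bufferByteSize; /* size of one queue buffer, in bytes */
  size_t bufferCount;           /* number of queue buffers */
} QueueLayout;

/* Hypothetical helper: clamps the per-buffer size and rounds the buffer
 * count up, following the same steps as audioInit() in the listing. */
QueueLayout queueLayout(double sampleRate, unsigned bytesPerPacket,
                        double chunkSeconds, double totalSeconds)
{
  QueueLayout q;
  double ratio;

  assert(sampleRate > 0 && chunkSeconds > 0);
  q.bufferByteSize = (unsigned long)(sampleRate * bytesPerPacket * chunkSeconds);
  if (q.bufferByteSize > MAX_BUFFER_SIZE) q.bufferByteSize = MAX_BUFFER_SIZE;
  if (q.bufferByteSize < MIN_BUFFER_SIZE) q.bufferByteSize = MIN_BUFFER_SIZE;

  /* Round totalSeconds / chunkSeconds up to the next whole buffer. */
  ratio = totalSeconds / chunkSeconds;
  q.bufferCount = (size_t)ratio;
  if ((double)q.bufferCount < ratio) q.bufferCount++;
  if (q.bufferCount < MIN_BUFFER_COUNT) q.bufferCount = MIN_BUFFER_COUNT;
  return q;
}
```

For 16-bit mono audio at 16 kHz with 2 s of total buffering, the low-latency period (0.015 s) works out to 134 buffers of 480 bytes, and the high-latency period (0.200 s) to 10 buffers of 6400 bytes.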
## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/aqs-stream.{c,h}_ **aqs-stream.h:** ```c /* Sensory Confidential * Copyright (C)2019-2026 Sensory, Inc. https://sensory.com/ * * TrulyHandsfree SDK custom stream header. See aqs-stream.c. *------------------------------------------------------------------------------ */ typedef enum { STREAM_LATENCY_LOW, /* low latency, high CPU overhead */ STREAM_LATENCY_HIGH, /* higher latency, with lower CPU overhead */ } StreamLatency; SnsrStream streamFromAQS(unsigned int rate, SnsrStreamMode mode, StreamLatency latency); ``` { data-search-exclude } **aqs-stream.c:** ```c /* Sensory Confidential * Copyright (C)2018-2026 Sensory, Inc. https://sensory.com/ * *------------------------------------------------------------------------------ * SnsrStream provider for Audio Queue Services on Darwin. * Currently capture-only. *------------------------------------------------------------------------------ */ #include #include #include #include #include #include #include #include "aqs-stream.h" /* Initial size of the circular capture buffer. 500 ms at 16kHz */ #define CAPTURE_MINSIZE 16000 /* Maximum size of the circular capture buffer. 
10 s at 16kHz */
#define CAPTURE_MAXSIZE 320000

#define PERIOD_LOW_LATENCY 0.015
#define PERIOD_HIGH_LATENCY 0.200
#define BUFFER_SIZE_SECONDS 2.0
/* 10 ms at 16 kHz */
#define MIN_BUFFER_SIZE 320
/* 5 s at 16 kHz */
#define MAX_BUFFER_SIZE 160000
#define MIN_BUFFER_COUNT 4

typedef struct {
  SnsrStream capture;                     /* Captured audio buffer */
  const char *initErrorMsg;               /* NULL if initialization ok */
  AudioStreamBasicDescription dataFormat; /* recording format description */
  AudioQueueRef queue;                    /* OS-level recording queue */
  AudioQueueBufferRef *buffer;            /* queue buffers */
  pthread_mutex_t lock;                   /* protects capture stream */
  pthread_cond_t notEmpty;                /* signals capture stream */
  size_t bufferCount;                     /* number of buffers in *buffer */
  UInt32 bufferByteSize;                  /* size of one queue buffer */
  unsigned isRunning:1;                   /* 0 to wind down recording */
} ProviderData;

static SnsrRC
ossErrorMessage(SnsrStream b, OSStatus s)
{
  const char *msg;
  switch (s) {
  case kAudioServicesNoError:
    /* No error: nothing to report. Return here rather than falling
     * through to the next case. */
    return SNSR_RC_OK;
  case kAudioServicesUnsupportedPropertyError:
    msg = "The property is not supported.";
    break;
  case kAudioServicesBadPropertySizeError:
    msg = "The size of the property data was not correct.";
    break;
  case kAudioServicesBadSpecifierSizeError:
    msg = "The size of the specifier data was not correct.";
    break;
  case kAudioServicesSystemSoundClientTimedOutError:
    msg = "System sound client message timed out.";
    break;
  case kAudioServicesSystemSoundUnspecifiedError:
  default:
    snsrStream_setDetail(b, "An unspecified error occurred, code %i.", (int)s);
    return SNSR_RC_ERROR;
  }
  snsrStream_setDetail(b, "%s", msg);
  return SNSR_RC_ERROR;
}

static void
audioCallback(void *privateData, AudioQueueRef inAQ,
              AudioQueueBufferRef inBuffer,
              const AudioTimeStamp *inStartTime,
              UInt32 inNumPackets,
              const AudioStreamPacketDescription *inPacketDesc)
{
  ProviderData *d = (ProviderData *)privateData;
  size_t written;

  if (!inNumPackets && d->dataFormat.mBytesPerPacket)
    inNumPackets = inBuffer->mAudioDataByteSize /
d->dataFormat.mBytesPerPacket; if (inNumPackets) { pthread_mutex_lock(&d->lock); written = snsrStreamWrite(d->capture, inBuffer->mAudioData, d->dataFormat.mBytesPerPacket, inNumPackets); if (written < inNumPackets) snsrStream_setRC(d->capture, SNSR_RC_BUFFER_OVERRUN); pthread_cond_broadcast(&d->notEmpty); pthread_mutex_unlock(&d->lock); } if (d->isRunning) AudioQueueEnqueueBuffer(d->queue, inBuffer, 0, NULL); } static OSStatus audioInit(ProviderData *d, double sampleRate, double chunkSizeSeconds, double totalSizeSeconds) { AudioStreamBasicDescription *f = &d->dataFormat; OSStatus s; int i; /* Recording format: 16-bit LPCM at 16 kHz */ memset(f, 0, sizeof(*f)); f->mFormatID = kAudioFormatLinearPCM; f->mSampleRate = sampleRate; f->mChannelsPerFrame = 1; f->mBitsPerChannel = 16; f->mBytesPerPacket = f->mBytesPerFrame = f->mChannelsPerFrame * sizeof(SInt16); f->mFramesPerPacket = 1; f->mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked; /* Create recording queue */ s = AudioQueueNewInput(f, audioCallback, d, NULL, kCFRunLoopCommonModes, 0, &d->queue); if (s != kAudioServicesNoError) return s; /* Derive appropriate audio queue buffer size */ d->bufferByteSize = (UInt32)(f->mSampleRate * f->mBytesPerPacket * chunkSizeSeconds); if (d->bufferByteSize > MAX_BUFFER_SIZE) d->bufferByteSize = MAX_BUFFER_SIZE; if (d->bufferByteSize < MIN_BUFFER_SIZE) d->bufferByteSize = MIN_BUFFER_SIZE; d->bufferCount = ceil(totalSizeSeconds / chunkSizeSeconds); if (d->bufferCount < MIN_BUFFER_COUNT) d->bufferCount = MIN_BUFFER_COUNT; /* Allocate audio buffers */ d->buffer = malloc(sizeof(*d->buffer) * d->bufferCount); for (i = 0; i < d->bufferCount && s == kAudioServicesNoError; i++) s = AudioQueueAllocateBuffer(d->queue, d->bufferByteSize, d->buffer + i); return s; } /*------------------------------------------------------------------------------ */ static SnsrRC streamOpen(SnsrStream b) { ProviderData *d = (ProviderData *)snsrStream_getData(b); OSStatus s; int i; if 
(d->initErrorMsg) { snsrStream_setDetail(b, "%s", d->initErrorMsg); return SNSR_RC_NOT_FOUND; } snsrStreamOpen(d->capture); s = kAudioServicesNoError; for (i = 0; i < d->bufferCount && s == kAudioServicesNoError; i++) s = AudioQueueEnqueueBuffer(d->queue, d->buffer[i], 0, NULL); if (s == kAudioServicesNoError) s = AudioQueueStart(d->queue, NULL); if (s != kAudioServicesNoError) return ossErrorMessage(b, s); d->isRunning = 1; return snsrStreamRC(b); } static SnsrRC streamClose(SnsrStream b) { ProviderData *d = (ProviderData *)snsrStream_getData(b); SnsrRC r; d->isRunning = 0; AudioQueueStop(d->queue, true); /* Flush the capture buffer */ r = snsrStreamRC(d->capture); snsrStreamSkip(d->capture, 1, CAPTURE_MAXSIZE); snsrStream_setRC(d->capture, SNSR_RC_OK); return r == SNSR_RC_OK? snsrStreamRC(b): r; } static void streamRelease(SnsrStream b) { ProviderData *d = (ProviderData *)snsrStream_getData(b); snsrRelease(d->capture); AudioQueueDispose(d->queue, true); pthread_mutex_destroy(&d->lock); pthread_cond_destroy(&d->notEmpty); free(d->buffer); free(d); } static size_t streamRead(SnsrStream b, void *buffer, size_t size) { SnsrRC r; ProviderData *d = (ProviderData *)snsrStream_getData(b); size_t read = 0; pthread_mutex_lock(&d->lock); if (snsrStreamRC(d->capture) != SNSR_RC_OK) { snsrStream_setRC(b, snsrStreamRC(d->capture)); } else { do { read += snsrStreamRead(d->capture, (char *)buffer + read, 1, size - read); r = snsrStreamRC(d->capture); } while ((r == SNSR_RC_OK || r == SNSR_RC_EOF) && read < size && !pthread_cond_wait(&d->notEmpty, &d->lock)); if (r != SNSR_RC_OK) { snsrStream_setRC(b, r); snsrStream_setDetail(b, "%s", snsrStreamErrorDetail(d->capture)); } else if (read < size) { snsrStream_setRC(b, SNSR_RC_EOF); } } pthread_mutex_unlock(&d->lock); return read; } static SnsrStream_Vmt ProviderDef = { "Darwin Audio Queue Services audio capture", &streamOpen, &streamClose, &streamRelease, &streamRead, NULL }; SnsrStream streamFromAQS(unsigned int rate, 
                        SnsrStreamMode mode, StreamLatency latency)
{
  SnsrStream b;
  ProviderData *d = (ProviderData *)malloc(sizeof(*d));
  OSStatus s;
  int r;

  if (!d) return NULL;
  memset(d, 0, sizeof(*d));
  b = snsrStream_alloc(&ProviderDef, d, 1, 0);
  if (!b) {
    free(d);
    return NULL;
  }
  do {
    d->capture = snsrStreamFromBuffer(CAPTURE_MINSIZE, CAPTURE_MAXSIZE);
    if (!d->capture) {
      snsrStream_setRC(b, SNSR_RC_NO_MEMORY);
      break;
    }
    snsrRetain(d->capture);
    if (mode != SNSR_ST_MODE_READ) {
      snsrStream_setRC(b, SNSR_RC_INVALID_MODE);
      break;
    }
    /* Signalling and mutexes */
    r = pthread_mutex_init(&d->lock, NULL);
    if (!r) r = pthread_cond_init(&d->notEmpty, NULL);
    if (r) {
      snsrStream_setRC(b, SNSR_RC_ERROR);
      snsrStream_setDetail(b, "%s", strerror(r));
      break;
    }
    /* Allocate buffers */
    s = audioInit(d, rate,
                  latency == STREAM_LATENCY_LOW?
                  PERIOD_LOW_LATENCY: PERIOD_HIGH_LATENCY,
                  BUFFER_SIZE_SECONDS);
    if (s != kAudioServicesNoError) {
      snsrStream_setRC(b, ossErrorMessage(b, s));
      break;
    }
  } while (0);
  if (snsrStreamRC(b) != SNSR_RC_OK)
    d->initErrorMsg = strdup(snsrStreamErrorDetail(b));
  return b;
}
```

[Audio Queue Services]: https://developer.apple.com/documentation/audiotoolbox/audio-queue-services "Audio Toolbox, Core Audio"

*[API]: Application Programming Interface
*[AQS]: Audio Queue Services, Apple's audio capture API on Darwin / macOS
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "api/sample/c/data-stream.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/data-stream/"
---

# data-stream.c

This is the source for the [fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) implementation for memory data, similar to [fromMemory](https://doc.sensory.com/tnl/7.8/api/io.md#frommemory). It's used in the [spot-data-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/spot-data-stream.md#spot-data-streamc) example.
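
The core of a memory-backed stream is a bounded read: copy at most the bytes that remain, advance an index, and flag end-of-data when the caller asks for more than is left. The sketch below shows that logic on its own, without the SDK. `MemReader`, `memReaderInit`, and `memReaderRead` are hypothetical names used only for illustration; they are not part of the TrulyNatural API, and the `eof` field is an assumed stand-in for the stream return code that the sample sets.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the sample's instance data; not an SDK type. */
typedef struct {
  const char *data;   /* caller-owned bytes, not copied */
  size_t dataSize;    /* total number of bytes in data */
  size_t index;       /* offset of the next byte to read */
  int eof;            /* set once a read asks for more than remains */
} MemReader;

/* Point the reader at a block of memory and rewind it. */
static void memReaderInit(MemReader *r, const void *data, size_t dataSize)
{
  r->data = data;
  r->dataSize = dataSize;
  r->index = 0;
  r->eof = 0;
}

/* Copy up to readSize bytes into buffer; return the number copied. */
static size_t memReaderRead(MemReader *r, void *buffer, size_t readSize)
{
  size_t available = r->dataSize - r->index;
  size_t n = readSize;

  if (n > available) {
    n = available;    /* short read: the data has run out */
    r->eof = 1;
  }
  if (n) memcpy(buffer, r->data + r->index, n);
  r->index += n;
  return n;
}
```

In the sample itself this logic sits behind the stream's read callback, and the end-of-data condition is reported through the stream's return code rather than a struct field.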

For a Python [fromProvider](https://doc.sensory.com/tnl/7.8/api/io.md#fromprovider) example using a memory-backed audio source, see [custom_stream.py](https://doc.sensory.com/tnl/7.8/api/sample/python/custom_stream.md#custom_streampy).

## Instructions

See [spot-data-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/spot-data-stream.md#spot-data-streamc).

## Code

Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/data-stream.{c,h}_

**data-stream.h:**

```c
/* Sensory Confidential
 * Copyright (C)2018-2026 Sensory, Inc. https://sensory.com/
 *
 * TrulyHandsfree SDK example of a custom stream, see data-stream.c.
 *------------------------------------------------------------------------------
 */

SnsrStream
streamFromData(void *data, size_t dataSize, SnsrStreamMode mode);
```

**data-stream.c:**

```c
/* Sensory Confidential
 * Copyright (C)2018-2026 Sensory, Inc. https://sensory.com/
 *
 * TrulyHandsfree SDK example of a custom stream, such as an audio stream
 * to get input from a custom audio driver (in a RTOS for example).
 * Similar streams are in samples wmme-stream.c and alsa-stream.c.
 *
 * NOTE: Normally it's best to use snsrStreamFromMemory(..) for a
 * stream of data, although this example would also work.
 *-----------------------------------------------------------------------------
 */

#include <snsr.h>

#include <stdlib.h>
#include <string.h>

typedef struct {
  char *data;
  size_t dataSize;
  size_t index;
} ProviderData;

static SnsrRC
streamOpen(SnsrStream stream)
{
  ProviderData *instanceData = snsrStream_getData(stream);

  if (!instanceData->data) return SNSR_RC_ERROR;
  instanceData->index = 0;
  return SNSR_RC_OK;
}

static SnsrRC
streamClose(SnsrStream stream)
{
  ProviderData *instanceData = snsrStream_getData(stream);

  instanceData->index = 0;
  return SNSR_RC_OK;
}

static void
streamRelease(SnsrStream stream)
{
  ProviderData *instanceData = snsrStream_getData(stream);

  free(instanceData);
}

static size_t
streamRead(SnsrStream stream, void *buffer, size_t readSize)
{
  /* NOTE: For a live audio stream, if there is no data available,
   * this call should block until there is more data available.
   */
  ProviderData *d = snsrStream_getData(stream);
  size_t available, read;

  read = readSize;
  available = d->dataSize - d->index;
  if (read > available) {
    read = available;
    /* Session will end with SNSR_RC_STREAM_END */
    snsrStream_setRC(stream, SNSR_RC_EOF);
  }
  if (read) memcpy(buffer, d->data + d->index, read);
  d->index += read;
  return read;
}

static size_t
streamWrite(SnsrStream stream, const void *buffer, size_t writeSize)
{
  /* NOTE: For a live audio stream, implementing a streamWrite
   * would make no sense and this function should be removed.
   */
  ProviderData *d = snsrStream_getData(stream);
  size_t available, written;

  written = writeSize;
  available = d->dataSize - d->index;
  if (written > available) {
    written = available;
    /* Session will end with SNSR_RC_STREAM_END */
    snsrStream_setRC(stream, SNSR_RC_EOF);
  }
  if (written) memcpy(d->data + d->index, buffer, written);
  d->index += written;
  return written;
}

/* This is the interface any SnsrStream has to provide
 * (Virtual Method Table)
 */
static SnsrStream_Vmt methods = {
  "data",
  &streamOpen,
  &streamClose,
  &streamRelease,
  &streamRead,
  &streamWrite
};

/* This is the 'constructor' for this kind of stream */
SnsrStream
streamFromData(void *data, size_t dataSize, SnsrStreamMode mode)
{
  SnsrStream dataStream;
  int readable = (mode == SNSR_ST_MODE_READ);
  int writeable = (mode == SNSR_ST_MODE_WRITE);
  /* The stream object has instance data (particular to this instance)
   * and virtual method pointers (particular to this type)
   * just like it would in C++
   */
  ProviderData *instanceData = malloc(sizeof(*instanceData));

  /* Fail early if the allocation did not succeed */
  if (!instanceData) return NULL;
  memset(instanceData, 0, sizeof(*instanceData));
  instanceData->data = data;
  instanceData->dataSize = dataSize;
  dataStream = snsrStream_alloc(&methods, instanceData, readable, writeable);
  if (!dataStream) {
    free(instanceData);
    return NULL;
  }
  if (data == NULL) snsrStream_setRC(dataStream, SNSR_RC_INVALID_ARG);
  return dataStream;
}
```

*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "api/sample/c/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/"
---

# C examples

The C sample programs are available in _sample/c/_ in the TrulyNatural installation directory.
See _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/_

You can build the [sample code](https://doc.sensory.com/tnl/7.8/api/sample/c/index.md#c-examples-list) with [CMake](https://doc.sensory.com/tnl/7.8/api/sample/c/index.md#examples-cmake) or with [GNU Make](https://doc.sensory.com/tnl/7.8/api/sample/c/index.md#examples-make).

## Examples

[live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spotc) - Shows how to run a wake word recognizer on live audio captured from the default audio source. For a shorter teaching version you type yourself, see [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program).

[live-spot-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot-stream.md#live-spot-streamc) - Runs a wake word recognizer on live audio captured using a custom audio stream, defined in [alsa-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/alsa-stream.md#alsa-streamc).

[live-segment.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-segment.md#live-segmentc) - Runs a wake word recognizer on live audio, segments the speech following the wake word with a VAD, and then saves this audio snippet to a file.

[push-audio.c](https://doc.sensory.com/tnl/7.8/api/sample/c/push-audio.md#push-audioc) - Runs a recognizer where the application pushes data through the recognition pipeline. Shows VAD audio processing for use with third-party recognizers such as keyword-to-search applications.

[spot-data.c](https://doc.sensory.com/tnl/7.8/api/sample/c/spot-data.md#spot-datac) - Runs a small keyword spotter from code space. It uses a [custom memory allocator](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloctlsf) to avoid calls to the system heap allocator, and reads audio data from code space to avoid file system use.

[spot-data-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/spot-data-stream.md#spot-data-streamc) - Runs a wake word recognizer from code space with a [custom audio stream](https://doc.sensory.com/tnl/7.8/api/sample/c/data-stream.md#data-streamc), using pull-mode processing with [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run). It is a reasonable starting point for running on a small device with an RTOS.

[alsa-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/alsa-stream.md#alsa-streamc) - Source for the [fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) implementation for [ALSA][], used for live audio capture on Linux.

[aqs-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/aqs-stream.md#aqs-streamc) - Source for the [fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) implementation for [Audio Queue Services][], used for live audio capture on macOS and iOS.

[wmme-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/wmme-stream.md#wmme-streamc) - Source for the [fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) implementation for [Windows Multimedia Extensions][], used for live audio capture on Windows.

[data-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/data-stream.md#data-streamc) - Source for the [fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) implementation for memory data, similar to [fromMemory](https://doc.sensory.com/tnl/7.8/api/io.md#frommemory). It's used in the [spot-data-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/spot-data-stream.md#spot-data-streamc) example.

[snsr-edit.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-edit.md#snsr-editc) - Source for the [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit) command-line tool.

[snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-evalc) - Source for the [snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval) command-line tool, and the `snsr-eval-subset` sample.

[spot-convert.c](https://doc.sensory.com/tnl/7.8/api/sample/c/spot-convert.md#spot-convertc) - Source for the [spot-convert](https://doc.sensory.com/tnl/7.8/tools/spot-convert.md#spot-convert) command-line tool.

[spot-enroll.c](https://doc.sensory.com/tnl/7.8/api/sample/c/spot-enroll.md#spot-enrollc) - Source for the [spot-enroll](https://doc.sensory.com/tnl/7.8/tools/spot-enroll.md#spot-enroll) command-line tool.

[live-enroll.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-enroll.md#live-enrollc) - Source for the [live-enroll](https://doc.sensory.com/tnl/7.8/tools/live-enroll.md#live-enroll) command-line tool.

## Build with CMake

Build the code samples with [CMake][] on Linux, macOS, and Windows. This requires CMake 3.10 or later and a compiler toolchain.

Open a terminal window and enter the commands below.

```sh
cd ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c
cmake -S . -B build-sample
cmake --build build-sample --parallel --config Release
cmake --install build-sample
```

This installs the sample executables in the `bin/` subdirectory.

### CMakeLists.txt

These are the `cmake` configuration files used to build the sample code.

### Details: CMakeLists.txt

```cmake
# Sensory Confidential
# Copyright (C)2024-2026 Sensory, Inc. https://sensory.com/
#
# TrulyNatural SDK sample code build configuration
#
# Configure, build, and install these samples with:
#
#     cmake -S . -B build-sample
#     cmake --build build-sample --parallel --config Release
#     cmake --install build-sample
#
# Then find the sample executables in build-sample/bin/

cmake_minimum_required(VERSION 3.10.0)
if(POLICY CMP0177)
  cmake_policy(SET CMP0177 NEW)
endif()

project(SnsrSamples)
list(APPEND CMAKE_MODULE_PATH "$ENV{HOME}/Sensory/TrulyNaturalSDK/7.9.0-pre.0")
include(SnsrLibrary)
add_subdirectory(src)
```

### Details: src/CMakeLists.txt

```cmake
# Sensory Confidential
# Copyright (C)2024-2026 Sensory, Inc. https://sensory.com/
#
# This is not a stand-alone configuration. See ../CMakeLists.txt

set(SAMPLE_BINARY_DIR ${CMAKE_CURRENT_SOURCE_DIR}/../bin)
set(MODEL_DIR ${CMAKE_CURRENT_SOURCE_DIR}/../../../model)
set(SRC_GEN ${PROJECT_BINARY_DIR}/src)
set(SPT_HBG spot-hbg-enUS-1.4.0-m)
set(TPL_VAD tpl-vad-lvcsr-3.17.0)

add_executable(live-enroll live-enroll.c)
target_link_libraries(live-enroll SnsrLibraryOmitOSS)
install(TARGETS live-enroll DESTINATION ${SAMPLE_BINARY_DIR})

add_executable(live-segment live-segment.c)
target_link_libraries(live-segment SnsrLibraryOmitOSS)
install(TARGETS live-segment DESTINATION ${SAMPLE_BINARY_DIR})

add_executable(live-spot live-spot.c)
target_link_libraries(live-spot SnsrLibrary)
install(TARGETS live-spot DESTINATION ${SAMPLE_BINARY_DIR})

if (APPLE)
  add_executable(live-spot-stream live-spot-stream.c aqs-stream.c)
  target_link_libraries(live-spot-stream SnsrLibrary)
  install(TARGETS live-spot-stream DESTINATION ${SAMPLE_BINARY_DIR})
elseif (UNIX)
  add_executable(live-spot-stream live-spot-stream.c alsa-stream.c)
  target_link_libraries(live-spot-stream SnsrLibrary)
  install(TARGETS live-spot-stream DESTINATION ${SAMPLE_BINARY_DIR})
elseif (WIN32)
  add_executable(live-spot-stream live-spot-stream.c wmme-stream.c)
  target_link_libraries(live-spot-stream SnsrLibrary)
  install(TARGETS live-spot-stream DESTINATION ${SAMPLE_BINARY_DIR})
endif ()

add_executable(push-audio push-audio.c)
target_link_libraries(push-audio SnsrLibrary)
install(TARGETS
  push-audio DESTINATION ${SAMPLE_BINARY_DIR})

add_executable(snsr-edit snsr-edit.c)
target_link_libraries(snsr-edit SnsrLibrary)
install(TARGETS snsr-edit DESTINATION ${SAMPLE_BINARY_DIR})

add_custom_command(
  OUTPUT ${SRC_GEN}/${SPT_HBG}.c
  COMMAND snsr-edit -c spot_hbg_enUS -t ${MODEL_DIR}/${SPT_HBG}.snsr
  DEPENDS snsr-edit
)

add_custom_command(
  OUTPUT ${SRC_GEN}/${TPL_VAD}.c
  COMMAND snsr-edit -c tpl_vad_lvcsr -t ${MODEL_DIR}/${TPL_VAD}.snsr
  DEPENDS snsr-edit
)

add_executable(snsr-eval snsr-eval.c ${SRC_GEN}/${TPL_VAD}.c)
target_link_libraries(snsr-eval SnsrLibrary)
install(TARGETS snsr-eval DESTINATION ${SAMPLE_BINARY_DIR})

add_custom_command(
  OUTPUT ${SRC_GEN}/snsr-custom-init.c
  COMMAND snsr-edit -it ${MODEL_DIR}/${SPT_HBG}.snsr
  DEPENDS snsr-edit
)

add_executable(snsr-eval-subset snsr-eval.c
  ${SRC_GEN}/${TPL_VAD}.c ${SRC_GEN}/snsr-custom-init.c)
target_link_libraries(snsr-eval-subset SnsrLibrary)
target_compile_options(snsr-eval-subset PRIVATE -DSNSR_USE_SUBSET)
install(TARGETS snsr-eval-subset DESTINATION ${SAMPLE_BINARY_DIR})

add_executable(spot-convert spot-convert.c)
target_link_libraries(spot-convert SnsrLibraryOmitOSS)
install(TARGETS spot-convert DESTINATION ${SAMPLE_BINARY_DIR})

add_executable(spot-data spot-data.c data.c ${SRC_GEN}/${SPT_HBG}.c)
target_link_libraries(spot-data SnsrLibrary)
install(TARGETS spot-data DESTINATION ${SAMPLE_BINARY_DIR})

add_executable(spot-data-stream spot-data-stream.c data-stream.c data.c
  ${SRC_GEN}/${SPT_HBG}.c)
target_link_libraries(spot-data-stream SnsrLibrary)
install(TARGETS spot-data-stream DESTINATION ${SAMPLE_BINARY_DIR})

add_executable(spot-enroll spot-enroll.c)
target_link_libraries(spot-enroll SnsrLibraryOmitOSS)
install(TARGETS spot-enroll DESTINATION ${SAMPLE_BINARY_DIR})
```

## Build with GNU Make

Build the code samples with [GNU Make][] on Linux and macOS only. This requires GNU Make 3.81 or later and a compiler toolchain.

Open a terminal window and enter the commands below.

```sh
cd ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c
make -j all
```

This installs the sample executables in the `bin/` subdirectory.

If you run `make` without arguments, the `Makefile` lists all available targets:

```console
% make
Make targets:
  make all     # build all executables in ./bin
  make clean   # remove build artifacts
  make debug   # build all with debugging enabled
  make help    # display this help message
  make test    # run enrollment and spotting tests

Building for macos from SDK root directory ../..
```

Run the sample tests:

```console
% make -j -s test
Running test-enroll-0.
Running test-enroll-1.
Running test-enroll-2.
Running test-enroll-3.
Running test-convert-0.
Running test-data-0.
Running test-data-1.
Running test-subset-0.
./build/out/dsp-pc38-3.4.0-op10-prod-net.bin: OK
./build/out/dsp-pc38-3.4.0-op10-prod-search.bin: OK
./build/out/dsp-search-check.h: OK

Say the enrollment phrase (1/4) for "armadillo-1"
Recording: 1.88 s

Say the enrollment phrase (2/4) for "armadillo-1"
Recording: 1.71 s

Say the enrollment phrase (3/4) for "armadillo-1" with context,
 for example: " will it rain tomorrow?"
Recording: 4.02 s

Say the enrollment phrase (4/4) for "armadillo-1" with context,
 for example: " will it rain tomorrow?"
Recording: 2.91 s
Running test-push-0.
Running test-push-1.
SUCCESS: All tests passed.
```

### Makefile

This is the `make` configuration file used to build and test the sample code.

### Details: Makefile

```makefile
# Sensory Confidential
# Copyright (C)2015-2026 Sensory, Inc. https://sensory.com/
#
# TrulyNatural SDK GNU make build script

SNSR_ROOT := ../..
SNSR_EDIT = $(BIN_DIR)/snsr-edit

# This Makefile is meant to run on the target platform.
# Uncomment the following line if cross-compiling instead.
# SNSR_EDIT = $(TOOL_DIR)/snsr-edit # OS-specific compiler defaults OS_NAME := $(shell uname -s) ifeq ($(OS_NAME),Linux) # Linux ARCH_NAME := $(shell $(CC) -dumpmachine) OS_CFLAGS := -O3 -fPIC -DNDEBUG OS_CFLAGS += -Wall -Werror OS_CFLAGS += -fdata-sections -ffunction-sections OS_LIBS := -lsnsr -lasound -lpthread -lm -ldl -lstdc++ OS_LDFLAGS+= -Wl,--gc-sections STATSIZE := stat -c %s else ifeq ($(OS_NAME),Darwin) # macOS ARCH_NAME := macos ARCH := $(shell uname -m) XCODE := /Applications/Xcode.app/Contents/Developer SYSROOT := $(XCODE)/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk CC := $(XCODE)/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang OS_ARCH := -arch $(ARCH) OS_CFLAGS := -O3 -fPIC -DNDEBUG OS_CFLAGS += $(OS_ARCH) OS_CFLAGS += -Wall -Werror OS_CFLAGS += -isysroot $(SYSROOT) OS_CFLAGS += -fdata-sections -ffunction-sections OS_LDFLAGS+= -isysroot $(SYSROOT) OS_LDFLAGS+= -dead_strip OS_LDFLAGS+= $(OS_ARCH) OS_LIBS := -lsnsr -framework AudioToolbox -framework CoreFoundation OS_LIBS += -framework Foundation -framework Accelerate OS_LIBS += -lm -lstdc++ STATSIZE := stat -f %z else $(error This operating system ($(OS_NAME)) is not supported) endif OS_CFLAGS += -I$(SNSR_ROOT)/include OS_LDFLAGS += -L$(SNSR_ROOT)/lib/$(ARCH_NAME) TARGET_DIR := . BIN_DIR = $(TARGET_DIR)/bin SRC_DIR = $(TARGET_DIR)/src OBJ_DIR = $(BUILD_DIR)/obj OUT_DIR = $(BUILD_DIR)/out BUILD_DIR = $(TARGET_DIR)/build TEST_DIR = $(TARGET_DIR)/test MODEL_DIR = $(SNSR_ROOT)/model DATA_DIR = $(SNSR_ROOT)/data TOOL_DIR = $(SNSR_ROOT)/bin # $(call audio-files,filename-prefix,index-list) # e.g. 
$(call audio-files,armadillo-6-,0 1 2) # returns a list of absolute paths to SDK enrollment test data audio-files = $(addsuffix .wav,$(addprefix $(DATA_DIR)/enrollments/$1,$2)) TEST_DATA := $(call audio-files,armadillo-1-,0 1 2 3 4 5) TEST_DATA += $(call audio-files,armadillo-6-,0 1 2 3 4 5) TEST_DATA += $(call audio-files,jackalope-1-,0 1 2 3 4 5) TEST_DATA += $(call audio-files,jackalope-4-,0 1 2 3 4 5) TEST_DATA += $(call audio-files,terminator-2-,0 1 2 3 4 5) TEST_DATA += $(call audio-files,terminator-6-,0 1 2 3 4 5) TEST_DATA += $(call audio-files,armadillo-1-,0-c 1-c 2-c 3-c 4-c 5-c) TEST_DATA += $(call audio-files,jackalope-1-,0-c 1-c 2-c 3-c 4-c 5-c) UDT_MODEL = $(MODEL_DIR)/udt-universal-3.67.1.0.snsr UDT_MODEL_5 = $(MODEL_DIR)/udt-enUS-5.1.1.9.snsr VTPL_MODEL = $(MODEL_DIR)/tpl-spot-vad-3.13.0.snsr HBG_MODEL_V = spot-hbg-enUS-1.4.0-m HBG_MODEL = $(MODEL_DIR)/$(HBG_MODEL_V).snsr VG_MODEL = $(MODEL_DIR)/spot-voicegenie-enUS-6.5.1-m.snsr BASE_MODEL = $(OUT_DIR)/enrolled-sv VAD_MODEL_V = tpl-vad-lvcsr-3.17.0 VAD_MODEL = $(MODEL_DIR)/$(VAD_MODEL_V).snsr .PHONY: all clean debug help test .PHONY: test-enroll-0 test-enroll-1 test-enroll-2 test-enroll-3 .PHONY: test-convert-0 .PHONY: test-push-0 test-push-1 define help Make targets: make all # build all executables in $(BIN_DIR) make clean # remove build artifacts make debug # build all with debugging enabled make help # display this help message make test # run enrollment and spotting tests Building for $(ARCH_NAME) from SDK root directory $(SNSR_ROOT) endef # Adjust test program verbosity # Resolves to -v, unless make is run with the -s (silent) flag. 
v = $(if $(findstring s,$(MAKEFLAGS)),,-v) # Default target help:; $(info $(help)) clean: rm -rf $(BIN_DIR) $(BUILD_DIR) $(OBJ_DIR) $(OUT_DIR) segmented-audio.wav rm -f $(SRC_DIR)/snsr-custom-init.c rm -f $(SRC_DIR)/$(HBG_MODEL_V).c $(SRC_DIR)/$(VAD_MODEL_V).c debug: all debug: CFLAGS=-O0 -g -UNDEBUG test: test-enroll-0 test-enroll-1 test-enroll-2 test-enroll-3\ test-convert-0 test-push-0 test-push-1 test-data-0 test-data-1\ test-subset-0 $(info SUCCESS: All tests passed.) # End-to-end UDT enrollment test test-enroll-0: $(BIN_DIR)/spot-enroll $(BIN_DIR)/snsr-eval | $(OUT_DIR) $(info Running $@.) $(BIN_DIR)/spot-enroll $v $v -t $(UDT_MODEL)\ -o $(BASE_MODEL)-0.snsr\ +armadillo-6 $(call audio-files,armadillo-6-,0 1 2 3)\ +jackalope-4 $(call audio-files,jackalope-4-,0 1 2 3)\ +terminator-2 $(call audio-files,terminator-2-,0 1 2 3)\ +terminator-6 $(call audio-files,terminator-6-,0 1 2 3)\ +armadillo-1 $(call audio-files,armadillo-1-,0 1)\ -c $(call audio-files,armadillo-1-,0-c)\ -c $(call audio-files,armadillo-1-,1-c)\ +jackalope-1 $(call audio-files,jackalope-1-,0 1)\ -c $(call audio-files,jackalope-1-,0-c)\ -c $(call audio-files,jackalope-1-,1-c) $(BIN_DIR)/snsr-eval -t $(BASE_MODEL)-0.snsr $(TEST_DATA)\ > $(OUT_DIR)/$@.txt diff $(OUT_DIR)/$@.txt $(TEST_DIR)/$@.txt\ || (echo ERROR: $@ validation failed; exit 100) # End-to-end UDT enrollment test, using adapted enrollment contexts test-enroll-1: $(BIN_DIR)/spot-enroll $(BIN_DIR)/snsr-eval | $(OUT_DIR) $(info Running $@.) 
$(BIN_DIR)/spot-enroll $v -t $(UDT_MODEL)\ -a $(OUT_DIR)/armadillo-6.snsr\ -o $(OUT_DIR)/enrolled-armadillo-6.snsr\ +armadillo-6 $(call audio-files,armadillo-6-,0 1 2 3) $(BIN_DIR)/spot-enroll $v -t $(UDT_MODEL)\ -a $(OUT_DIR)/jackalope-4.snsr\ -o $(OUT_DIR)/enrolled-jackalope-4.snsr\ +jackalope-4 $(call audio-files,jackalope-4-,0 1 2 3) $(BIN_DIR)/spot-enroll $v -t $(UDT_MODEL)\ -a $(OUT_DIR)/terminator-2.snsr\ -o $(OUT_DIR)/enrolled-terminator-2.snsr\ +terminator-2 $(call audio-files,terminator-2-,0 1 2 3) $(BIN_DIR)/spot-enroll $v -t $(UDT_MODEL)\ -a $(OUT_DIR)/terminator-6.snsr\ -o $(OUT_DIR)/enrolled-terminator-6.snsr\ +terminator-6 $(call audio-files,terminator-6-,0 1 2 3) $(BIN_DIR)/spot-enroll $v -t $(UDT_MODEL)\ -a $(OUT_DIR)/armadillo-1.snsr\ -o $(OUT_DIR)/enrolled-armadillo.snsr\ +armadillo-1 $(call audio-files,armadillo-1-,0 1)\ -c $(call audio-files,armadillo-1-,0-c)\ -c $(call audio-files,armadillo-1-,1-c) $(BIN_DIR)/spot-enroll $v -t $(UDT_MODEL)\ -a $(OUT_DIR)/jackalope-1.snsr\ -o $(OUT_DIR)/enrolled-jackalope-1.snsr\ +jackalope-1 $(call audio-files,jackalope-1-,0 1)\ -c $(call audio-files,jackalope-1-,0-c)\ -c $(call audio-files,jackalope-1-,1-c) $(BIN_DIR)/spot-enroll $v -t $(UDT_MODEL)\ -t $(OUT_DIR)/armadillo-6.snsr\ -t $(OUT_DIR)/jackalope-4.snsr\ -t $(OUT_DIR)/terminator-2.snsr\ -t $(OUT_DIR)/terminator-6.snsr\ -t $(OUT_DIR)/armadillo-1.snsr\ -t $(OUT_DIR)/jackalope-1.snsr\ -o $(BASE_MODEL)-1.snsr $(BIN_DIR)/snsr-eval -t $(BASE_MODEL)-1.snsr $(TEST_DATA)\ > $(OUT_DIR)/$@.txt diff $(OUT_DIR)/$@.txt $(TEST_DIR)/$@.txt >/dev/null\ || diff $(OUT_DIR)/$@.txt $(TEST_DIR)/$@-alt.txt >/dev/null\ || (echo ERROR: $@ validation failed; exit 101) # Live end-to-end UDT enrollment test. test-enroll-2: $(BIN_DIR)/live-enroll $(BIN_DIR)/snsr-eval | $(OUT_DIR) $(info Running $@.) 
$(BIN_DIR)/live-enroll $v $v -t $(UDT_MODEL)\ -o $(BASE_MODEL)-2.snsr\ +armadillo-1 $(call audio-files,armadillo-1-,0 1 0-c 1-c) $(BIN_DIR)/snsr-eval -t $(BASE_MODEL)-2.snsr $(TEST_DATA)\ > $(OUT_DIR)/$@.txt diff $(OUT_DIR)/$@.txt $(TEST_DIR)/$@.txt\ || (echo ERROR: $@ validation failed; exit 102) # Test old UDT model test-enroll-3: $(BIN_DIR)/spot-enroll $(BIN_DIR)/snsr-eval | $(OUT_DIR) $(info Running $@.) $(BIN_DIR)/spot-enroll $v $v -t $(UDT_MODEL_5)\ -o $(BASE_MODEL)-3.snsr\ +armadillo-1 $(call audio-files,armadillo-1-,0 1 2 3) $(BIN_DIR)/snsr-eval -t $(BASE_MODEL)-3.snsr $(TEST_DATA)\ > $(OUT_DIR)/$@.txt diff $(OUT_DIR)/$@.txt $(TEST_DIR)/$@.txt\ || (echo ERROR: $@ validation failed; exit 103) # Validate DSP conversion test-convert-0: $(BIN_DIR)/spot-convert | $(OUT_DIR) $(info Running $@.) $(BIN_DIR)/spot-convert -t $(HBG_MODEL) -p $(OUT_DIR)/dsp pc38 tail -10 $(OUT_DIR)/dsp-pc38-3.4.0-op10-prod-search.h\ > $(OUT_DIR)/dsp-search-check.h shasum -c $(TEST_DIR)/dsp-checksum.txt # Push audio samples instead of the default pull # Uses test-enroll-0 models test-push-0: test-enroll-0 $(BIN_DIR)/push-audio | $(OUT_DIR) $(info Running $@.) $(BIN_DIR)/push-audio $(BASE_MODEL)-0.snsr\ $(call audio-files,jackalope-4-,0)\ > $(OUT_DIR)/$@.txt diff $(OUT_DIR)/$@.txt $(TEST_DIR)/$@.txt\ || (echo ERROR: $@ validation failed; exit 104) # Push audio samples instead of the default pull # Uses test-enroll-0 models and the tpl-spot-vad-*.snsr template test-push-1: test-enroll-0 $(BIN_DIR)/push-audio $(SNSR_EDIT) | $(OUT_DIR) $(info Running $@.) $(SNSR_EDIT) -t $(VTPL_MODEL)\ -f 0 $(BASE_MODEL)-0.snsr -o $(OUT_DIR)/spot-vad.snsr $(BIN_DIR)/push-audio $(OUT_DIR)/spot-vad.snsr\ $(call audio-files,armadillo-1-,1-c)\ > $(OUT_DIR)/$@.txt diff $(OUT_DIR)/$@.txt $(TEST_DIR)/$@.txt\ || (echo ERROR: $@ validation failed; exit 105) test-data-0: $(BIN_DIR)/spot-data | $(OUT_DIR) $(info Running $@.) 
$(BIN_DIR)/spot-data > $(OUT_DIR)/$@.txt diff $(OUT_DIR)/$@.txt $(TEST_DIR)/$@.txt\ || (echo ERROR: $@ validation failed; exit 104) test-data-1: $(BIN_DIR)/spot-data-stream | $(OUT_DIR) $(info Running $@.) $(BIN_DIR)/spot-data-stream > $(OUT_DIR)/$@.txt diff $(OUT_DIR)/$@.txt $(TEST_DIR)/$@.txt\ || (echo ERROR: $@ validation failed; exit 104) test-subset-0: $(BIN_DIR)/snsr-eval-subset $(BIN_DIR)/snsr-eval | $(OUT_DIR) $(info Running $@.) test $(shell $(STATSIZE) $(BIN_DIR)/snsr-eval-subset) -lt \ $(shell $(STATSIZE) $(BIN_DIR)/snsr-eval) ||\ (echo ERROR: $@ size validation failed; exit 105) $(BIN_DIR)/snsr-eval-subset -t $(HBG_MODEL) /dev/null ||\ (echo ERROR: $@ validation failed; exit 106) $(BIN_DIR)/snsr-eval-subset -t $(VG_MODEL) /dev/null 2>&1 |\ grep SNSR_USE_SUBSET >/dev/null ||\ (echo ERROR: $@ validation failed; exit 107) # Create a rule for building name from source, in $(BIN_DIR) # $(call add-target-rule,name,source1.c source2.c ...) add-target-rule = $(eval $(call emit-target-rule,$1,$2)) define emit-target-rule all: $$(BIN_DIR)/$(strip $1) $$(BIN_DIR)/$(strip $1): $$(addprefix $$(OBJ_DIR)/,$(2:.c=.o)) | $$(BIN_DIR) $$(CC) $$(OS_LDFLAGS) $$(LDFLAGS) -o $$@ $$^ $$(OS_LIBS) $$(LIBS) endef # Command-line application targets $(call add-target-rule, spot-convert, spot-convert.c) $(call add-target-rule, snsr-edit, snsr-edit.c) $(call add-target-rule, spot-enroll, spot-enroll.c) $(call add-target-rule, snsr-eval, snsr-eval.c tpl-vad-lvcsr-3.17.0.c) $(call add-target-rule, snsr-eval-subset,\ snsr-eval-subset.c snsr-custom-init.c tpl-vad-lvcsr-3.17.0.c) $(call add-target-rule, live-enroll, live-enroll.c) $(call add-target-rule, live-segment, live-segment.c) $(call add-target-rule, live-spot, live-spot.c) $(call add-target-rule, push-audio, push-audio.c) $(call add-target-rule, spot-data,\ spot-data.c spot-hbg-enUS-1.4.0-m.c data.c) $(call add-target-rule, spot-data-stream,\ spot-data-stream.c data-stream.c spot-hbg-enUS-1.4.0-m.c data.c) ifeq ($(OS_NAME),Linux) 
# The custom stream sample uses ALSA on Linux.
$(call add-target-rule, live-spot-stream, live-spot-stream.c alsa-stream.c)
else ifeq ($(OS_NAME),Darwin)
# The custom stream sample uses AQS on macOS.
$(call add-target-rule, live-spot-stream, live-spot-stream.c aqs-stream.c)
endif

# Build object files from C sources
$(OBJ_DIR)/%.o : $(SRC_DIR)/%.c | $(OBJ_DIR)
	$(CC) -c $(OS_CFLAGS) $(CFLAGS) -o $@ $<

# spot-enroll doesn't use OSS modules
$(OBJ_DIR)/spot-enroll.o : $(SRC_DIR)/spot-enroll.c | $(OBJ_DIR)
	$(CC) -DSNSR_OMIT_OSS_COMPONENTS -c $(OS_CFLAGS) $(CFLAGS) -o $@ $<

# Create $(SRC_DIR)/snsr-custom-init.c using snsr-edit,
# limit support to those modules needed for $(HBG_MODEL)
$(SRC_DIR)/snsr-custom-init.c: $(SNSR_EDIT)
	$(SNSR_EDIT) -o $@ -vit $(HBG_MODEL)

# Create $(SRC_DIR)/spot-hbg-enUS-*.c from the snsr model
$(SRC_DIR)/$(HBG_MODEL_V).c: $(SNSR_EDIT)
	$(SNSR_EDIT) -o $@ -c spot_hbg_enUS -vt $(HBG_MODEL)

# Create $(SRC_DIR)/tpl-vad-lvcsr-*.c from the snsr model
$(SRC_DIR)/$(VAD_MODEL_V).c: $(SNSR_EDIT)
	$(SNSR_EDIT) -o $@ -c tpl_vad_lvcsr -vt $(VAD_MODEL)

# Build snsr-eval-subset object files with -DSNSR_USE_SUBSET
$(OBJ_DIR)/snsr-eval-subset.o: $(SRC_DIR)/snsr-eval.c | $(OBJ_DIR)
	$(CC) -c $(OS_CFLAGS) $(CFLAGS) -DSNSR_USE_SUBSET -o $@ $<

# Create output directories
$(BIN_DIR) $(BUILD_DIR) $(OBJ_DIR) $(OUT_DIR):
	mkdir -p $@
```

[ALSA]: http://www.alsa-project.org/main/index.php/ALSA_Library_API "Advanced Linux Sound Architecture"
[Audio Queue Services]: https://developer.apple.com/documentation/audiotoolbox/audio-queue-services "Audio Toolbox, Core Audio"
[CMake]: https://cmake.org/ "CMake: A Powerful Software Build System"
[GNU Make]: https://www.gnu.org/software/make/ "GNU Make build automation tool"
[Windows Multimedia Extensions]: https://learn.microsoft.com/en-us/windows/win32/api/mmeapi/nf-mmeapi-waveinopen "waveInOpen"

*[ALSA]: Advanced Linux Sound Architecture
*[API]: Application Programming Interface
*[AQS]: Audio Queue Services, Apple's audio capture API on Darwin / macOS
*[RTOS]: Real-Time Operating System
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector
*[WMME]: Windows Multimedia Extensions, the audio capture API on Windows

---
source_path: "api/sample/c/live-enroll.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/live-enroll/"
---

# live-enroll.c

This is the source code for the [live-enroll](https://doc.sensory.com/tnl/7.8/tools/live-enroll.md#live-enroll) command-line tool.

For Python UDT enrollment, see [live_enroll.py](https://doc.sensory.com/tnl/7.8/api/sample/python/live_enroll.md#live_enrollpy).

## Instructions

See [live-enroll](https://doc.sensory.com/tnl/7.8/tools/live-enroll.md#live-enroll).

## Code

Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/live-enroll.c_

**live-enroll.c:**

```c
/* Sensory Confidential
 * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/
 *
 * TrulyHandsfree SDK keyword spotting command-line enrollment utility,
 * using live audio from the default capture source.
 *------------------------------------------------------------------------------
 */

#include <snsr.h>

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_OUT "enrolled-sv.snsr"
#define ENROLL_TASK_VERSION "~0.8.0 || 1.0.0"

#ifdef _MSC_VER
# define strdup _strdup
# if _MSC_VER < 1900
#  define snprintf _snprintf
# endif
#endif

typedef struct {
  const char *enroll;  /* optional enrollment context file name */
  const char *model;   /* enrolled phrase spotter output file name */
  const char *prefix;  /* audio capture file name prefix */
  const char **user;   /* pointer to users in argv[] */
  char *phrase;        /* enrollment phrase */
  SnsrStream audio;    /* audio stream handle */
  int verbosity;       /* incremented by the -v flag */
  int userCount;       /* number of users to enroll */
  int currentUser;     /* current user index */
  int failCount;       /* number of failed enrollment attempts */
} EnrollContext;

static SnsrRC
saveEnrollmentAudio(SnsrSession s, EnrollContext *e, const char *tag, int id)
{
  SnsrStream out, enrollment;
  SnsrRC r;
  const char *dash, *user = NULL;
  const char *format = "%s%s%s-%s-%i.wav";
  char *tmp;
  int len;

  dash = *e->prefix?
"-": ""; snsrGetString(s, SNSR_USER, &user); r = snsrGetStream(s, SNSR_AUDIO_STREAM, &enrollment); if (r != SNSR_RC_OK) return r; if (snsrStreamRC(enrollment) != SNSR_RC_OK) return SNSR_RC_OK; len = snprintf(NULL, 0, format, e->prefix, dash, user, tag, id); if (len < 0) { snsrDescribeError(s, "snprintf() failed, rc = %i\n", len); return SNSR_RC_ERROR; } tmp = malloc(++len); if (!tmp) return SNSR_RC_NO_MEMORY; snprintf(tmp, len, format, e->prefix, dash, user, tag, id); out = snsrStreamFromAudioFile(tmp, "w", SNSR_ST_AF_DEFAULT); snsrRetain(out); snsrStreamCopy(out, enrollment, SIZE_MAX); if ((r = snsrStreamRC(out)) == SNSR_RC_EOF) r = SNSR_RC_OK; if (r != SNSR_RC_OK) snsrDescribeError(s, "%s", snsrStreamErrorDetail(out)); else if (e->verbosity > 0) { printf("Saved enrollment audio to %s\n", tmp); fflush(stdout); } snsrRelease(out); free(tmp); return r; } static SnsrRC nextEvent(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; const char *tag; if (e->currentUser >= e->userCount) return SNSR_RC_OK; tag = e->user[e->currentUser++] + 1; return snsrSetString(s, SNSR_USER, tag); } static SnsrRC passEvent(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; int id = 0; if (e->verbosity >= 1) { printf("Preliminary enrollment checks passed.\n"); fflush(stdout); } if (!e->prefix) return SNSR_RC_OK; snsrGetInt(s, SNSR_RES_ENROLLMENT_ID, &id); return saveEnrollmentAudio(s, e, "pass", id); } static SnsrRC pauseEvent(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; snsrStreamClose(e->audio); printf("\n"); fflush(stdout); return SNSR_RC_OK; } static SnsrRC resumeEvent(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; SnsrRC r; const char *tag; int count, target; int ctx; snsrStreamOpen(e->audio); snsrGetInt(s, SNSR_ENROLLMENT_TARGET, &target); snsrGetInt(s, 
SNSR_RES_ENROLLMENT_COUNT, &count); snsrGetInt(s, SNSR_ADD_CONTEXT, &ctx); r = snsrGetString(s, SNSR_USER, &tag); if (r == SNSR_RC_OK) { printf("\nSay %s (%i/%i) for \"%s\"", e->phrase, count + 1, target, tag); if (ctx) printf(" with context,\n for example: " "\" will it rain tomorrow?\""); printf("\n"); fflush(stdout); } return r; } static SnsrRC samplesEvent(SnsrSession s, const char *key, void *privateData) { double count; snsrGetDouble(s, SNSR_RES_SAMPLES, &count); printf("Recording: %6.2f s\r", count / 16000.0); fflush(stdout); return SNSR_RC_OK; } static SnsrRC doneEvent(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; SnsrRC r; SnsrStream model = NULL, out; size_t written; r = snsrGetStream(s, SNSR_MODEL_STREAM, &model); if (r != SNSR_RC_OK) return r; written = snsrStreamGetMeta(model, SNSR_ST_META_BYTES_WRITTEN); out = snsrStreamFromFileName(e->model, "w"); snsrStreamCopy(out, model, written); r = snsrStreamRC(out); if (r != SNSR_RC_OK) snsrDescribeError(s, "%s", snsrStreamErrorDetail(out)); snsrRelease(out); if (r == SNSR_RC_OK && e->verbosity >= 1) { printf("Enrolled model saved to \"%s\"\n", e->model); fflush(stdout); } if (r != SNSR_RC_OK) return r; return SNSR_RC_STOP; } static SnsrRC enrolledEvent(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; SnsrRC r; r = snsrSave(s, SNSR_FM_RUNTIME, snsrStreamFromFileName(e->enroll, "w")); if (r == SNSR_RC_OK && e->verbosity >= 1) { printf("Enrollment context saved to \"%s\"\n", e->enroll); fflush(stdout); } return r; } static SnsrRC printReason(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; const char *guidance, *reason; int pass = 0; double value = 0.0, threshold = 0.0; snsrGetInt(s, SNSR_RES_REASON_PASS, &pass); if (pass) return snsrRC(s); snsrGetString(s, SNSR_RES_REASON, &reason); snsrGetString(s, SNSR_RES_GUIDANCE, &guidance); snsrGetDouble(s, 
SNSR_RES_REASON_VALUE, &value); snsrGetDouble(s, SNSR_RES_REASON_THRESHOLD, &threshold); if (snsrRC(s) == SNSR_RC_OK) { fprintf(stderr, "This enrollment recording is not usable.\n"); fprintf(stderr, " Reason: %s", reason); if (e->verbosity >= 2) fprintf(stderr, " (%.2f, threshold is %.2f)", value, threshold); fprintf(stderr, "\n Fix: %s\n", guidance); fflush(stdout); } return snsrRC(s); } static SnsrRC failEvent(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; printReason(s, key, privateData); if (e->verbosity >= 3) { fprintf(stderr, "\nAll failed checks:\n"); fflush(stdout); snsrForEach(s, SNSR_REASON_LIST, snsrCallback(printReason, NULL, e)); } if (!e->prefix) return SNSR_RC_OK; return saveEnrollmentAudio(s, e, "fail", e->failCount++); } static SnsrRC progEvent(SnsrSession s, const char *key, void *privateData) { SnsrRC r = SNSR_RC_OK; EnrollContext *e = (EnrollContext *)privateData; if (e->verbosity >= 1) { double progress; r = snsrGetDouble(s, SNSR_RES_PERCENT_DONE, &progress); if (r == SNSR_RC_OK) { printf("\rAdapting: %3.0f%% complete.", progress); if (progress >= 100) printf("\n"); fflush(stdout); } } return r; } static void fatal(int rc, const char *format, ...) { va_list a; fprintf(stderr, "ERROR: "); va_start(a, format); vfprintf(stderr, format, a); va_end(a); fprintf(stderr, "\n"); exit(rc); } static const char *usageDetail = "Settings are strings used as keys to query or change task behavior.\n" "Most frequently used for enrollment is accuracy, which takes a value\n" "between 0 and 1.\n" "Refer to the " SNSR_NAME " SDK documentation for a complete list and\n" "descriptions of all supported settings.\n"; static void usage(const char *name) { SnsrSession s; const char *libInfo; fprintf(stderr, "Enrolls " SNSR_NAME " SDK wake words on live audio.\n\n"); fprintf(stderr, "usage: %s -t task [options] +user1 [+user2 ...] 
[file ...]\n" " options:\n" " -e enrollments : enrollment context output filename\n" " -o out : enrolled model output filename (default: " DEFAULT_OUT ")\n" " -p prefix : capture each enrollment to file as\n" " --{pass,fail}-.wav\n" " -s setting=value : override a task setting\n" " -t task : specify task filename (required)\n" " -v [-v [-v]] : increase verbosity\n", name); fprintf(stderr, "\nEnrollment audio is captured from the default microphone, unless\n" "the optional [file ...] arguments are supplied.\n"); fprintf(stderr, "\n%s", usageDetail); snsrNew(&s); snsrGetString(s, SNSR_LIBRARY_INFO, &libInfo); fprintf(stderr, "\n%s\n", libInfo); snsrRelease(s); exit(199); } /* Report model license keys. */ static void reportModelLicense(SnsrSession s, const char *modelfile, int verbose) { const char *msg = NULL; if (verbose > 1) { snsrGetString(s, SNSR_MODEL_LICENSE_EXPIRES, &msg); if (msg) fprintf(stderr, "\"%s\": %s.\n", modelfile, msg); } msg = NULL; snsrGetString(s, SNSR_MODEL_LICENSE_WARNING, &msg); if (msg) fprintf(stderr, "WARNING for model \"%s\": %s.\n", modelfile, msg); } /* Store the first enrollment phrase in e.phrase. */ static SnsrRC getVocab(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; SnsrRC r; const char *vocab; r = snsrGetString(s, SNSR_RES_TEXT, &vocab); if (r != SNSR_RC_OK) return r; free(e->phrase); e->phrase = strdup(vocab); return r; } int main(int argc, char *argv[]) { EnrollContext e; SnsrRC r; SnsrSession s; int i, o; const char *msg = NULL; extern char *optarg; extern int optind; #ifdef SNSR_USE_SECURITY_CHIP uint32_t *securityChipComms(uint32_t *in); snsrConfig(SNSR_CONFIG_SECURITY_CHIP, securityChipComms); #endif if (argc == 1) usage(argv[0]); r = snsrNew(&s); if (r != SNSR_RC_OK) fatal(r, s? 
snsrErrorDetail(s): snsrRCMessage(r)); e.currentUser = 0; e.enroll = NULL; e.phrase = strdup("the enrollment phrase"); e.prefix = NULL; e.model = DEFAULT_OUT; e.failCount = 0; e.userCount = 0; e.verbosity = 0; while ((o = getopt(argc, argv, "e:o:p:s:t:v?")) >= 0) { switch (o) { case 'e': e.enroll = optarg; break; case 'o': e.model = optarg; break; case 'p': e.prefix = optarg; r = snsrSetInt(s, SNSR_SAVE_ENROLLMENT_AUDIO, 1); if (r == SNSR_RC_NO_MODEL) fatal(r, "set -t task before -p prefix"); break; case 's': r = snsrSet(s, optarg); if (r == SNSR_RC_NO_MODEL) fatal(r, "set -t task before -s setting=value"); else if (r != SNSR_RC_OK) fatal(r, snsrErrorDetail(s)); break; case 't': snsrLoad(s, snsrStreamFromFileName(optarg, "r")); snsrRequire(s, SNSR_TASK_TYPE, SNSR_ENROLL); r = snsrRequire(s, SNSR_TASK_VERSION, ENROLL_TASK_VERSION); if (r != SNSR_RC_OK) fatal(r, snsrErrorDetail(s)); reportModelLicense(s, optarg, e.verbosity); break; case 'v': e.verbosity++; break; case '?': default: usage(argv[0]); } } if (optind == argc || argv[optind][0] != '+') usage(argv[0]); /* Report application license status */ if (e.verbosity > 1) { snsrGetString(s, SNSR_LICENSE_EXPIRES, &msg); if (msg) fprintf(stderr, "\"%s\": %s.\n", argv[0], msg); } msg = NULL; snsrGetString(s, SNSR_LICENSE_WARNING, &msg); if (msg) fprintf(stderr, "WARNING for \"%s\": %s.\n", argv[0], msg); r = snsrSetInt(s, SNSR_INTERACTIVE_MODE, 1); if (r == SNSR_RC_NO_MODEL) usage(argv[0]); snsrSetHandler(s, SNSR_NEXT_EVENT, snsrCallback(nextEvent, NULL, &e)); snsrSetHandler(s, SNSR_DONE_EVENT, snsrCallback(doneEvent, NULL, &e)); snsrSetHandler(s, SNSR_FAIL_EVENT, snsrCallback(failEvent, NULL, &e)); snsrSetHandler(s, SNSR_PASS_EVENT, snsrCallback(passEvent, NULL, &e)); snsrSetHandler(s, SNSR_PROG_EVENT, snsrCallback(progEvent, NULL, &e)); snsrSetHandler(s, SNSR_PAUSE_EVENT, snsrCallback(pauseEvent, NULL, &e)); snsrSetHandler(s, SNSR_RESUME_EVENT, snsrCallback(resumeEvent, NULL, &e)); snsrSetHandler(s, 
SNSR_SAMPLES_EVENT, snsrCallback(samplesEvent, NULL, &e)); if (e.enroll) snsrSetHandler(s, SNSR_ENROLLED_EVENT, snsrCallback(enrolledEvent, NULL, &e)); /* SNSR_VOCAB_LIST is supported for a subset of models only, ignore errors */ if (snsrRC(s) == SNSR_RC_OK) { snsrForEach(s, SNSR_VOCAB_LIST, snsrCallback(getVocab, NULL, &e)); snsrClearRC(s); } for (i = optind; i < argc && argv[i][0] == '+'; i++) ; e.user = (const char **)argv + optind; e.userCount = i - optind; if (i == argc) { e.audio = snsrStreamFromAudioDevice(SNSR_ST_AF_DEFAULT); } else { SnsrStream tmp; e.audio = snsrStreamFromString(""); for (; i < argc; i++) { tmp = snsrStreamFromFileName(argv[i], "r"); tmp = snsrStreamFromAudioStream(tmp, SNSR_ST_AF_DEFAULT); e.audio = snsrStreamFromStreams(e.audio, tmp); } } snsrSetStream(s, SNSR_SOURCE_AUDIO_PCM, e.audio); r = snsrRun(s); if (r != SNSR_RC_OK && r != SNSR_RC_STOP) fatal(snsrRC(s), snsrErrorDetail(s)); free(e.phrase); snsrRelease(s); snsrTearDown(); return 0; } ``` *[API]: Application Programming Interface *[SDK]: Software Development Kit *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[UDT]: User-Defined Trigger: enrolled wake words and command sets --- source_path: "api/sample/c/live-segment.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/live-segment/" --- # live-segment.c This example runs a wake word recognizer on live audio, segments the speech following the wake word with a VAD, and then saves this audio snippet to a file. ## Instructions [Build](https://doc.sensory.com/tnl/7.8/api/sample/c/index.md#examples-cmake) the sample code. In the same terminal window type the command after the `%`, then say "Voice Genie will it rain in Portland tomorrow?" ```console % ./bin/live-segment ../../model/tpl-spot-vad-3.13.0.snsr ../../model/spot-voicegenie-enUS-6.5.1-m.snsr Say will it rain in Portland tomorrow? Spotted "voicegenie", listening... VAD detected speech from 3150 ms to 5055 ms. 
Wrote recording to "vad-audio.wav". ``` `vad-audio.wav` contains the speech after the wake word. ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/live-segment.c_ **live-segment.c:** ```c /* Sensory Confidential * Copyright (C)2017-2026 Sensory, Inc. https://sensory.com/ * * TrulyHandsfree SDK keyword spotter, runs trailing audio through * a voice activity detector and saves it to file. *------------------------------------------------------------------------------ */ #include #include /* Set INCLUDE_SPOT to 1 to include the trigger phrase in the audio output */ #define INCLUDE_SPOT 0 /* Output filename */ #define VAD_AUDIO_FILE "vad-audio.wav" /* Print an error message and exit. */ static void fatalError(int rc, const char *msg) { fprintf(stderr, "ERROR: %s\n", msg); exit(rc); } /* Result callback function, see snsrSetHandler() in main() below. */ static SnsrRC resultEvent(SnsrSession s, const char *key, void *privateData) { SnsrRC r; const char *phrase; r = snsrGetString(s, SNSR_RES_TEXT, &phrase); if (r == SNSR_RC_OK) { printf("Spotted \"%s\", listening...\n", phrase); fflush(stdout); } return r; } /* VAD segmentation callback - speech endpoint detected */ static SnsrRC endpointEvent(SnsrSession s, const char *key, void *privateData) { SnsrRC r; double begin, end; snsrGetDouble(s, SNSR_RES_BEGIN_MS, &begin); r = snsrGetDouble(s, SNSR_RES_END_MS, &end); if (r != SNSR_RC_OK) return r; printf("VAD detected speech from %.0f ms to %.f ms.\n", begin, end); return SNSR_RC_STOP; } int main(int argc, char *argv[]) { SnsrRC r; SnsrSession s; SnsrStream audio, out; const char *spotModel, *tmplModel; int testing = 0; if (argc < 3 || argc > 4) { fprintf(stderr, "usage: %s tmpl-vad-model spot-model [--test]\n", argv[0]); exit(1); } tmplModel = argv[1]; spotModel = argv[2]; testing = argc == 4; /* Create a new session handle. */ snsrNew(&s); /* Load and validate the spotter-vad template task file. 
*/ snsrLoad(s, snsrStreamFromFileName(tmplModel, "r")); snsrRequire(s, SNSR_TASK_TYPE, SNSR_PHRASESPOT_VAD); /* Load the spotter into template slot 0. */ snsrSetStream(s, SNSR_SLOT_0, snsrStreamFromFileName(spotModel, "r")); /* If requested, include the trigger phrase in the audio output. */ snsrSetInt(s, SNSR_INCLUDE_LEADING_SILENCE, INCLUDE_SPOT); /* Register VAD endpoint callbacks. */ snsrSetHandler(s, SNSR_END_EVENT, snsrCallback(endpointEvent, NULL, NULL)); snsrSetHandler(s, SNSR_LIMIT_EVENT, snsrCallback(endpointEvent, NULL, NULL)); /* Register a result callback. Private data handle is not used. */ snsrSetHandler(s, SNSR_RESULT_EVENT, snsrCallback(resultEvent, NULL, NULL)); /* Create an audio stream instance and attach it to the session. */ if (testing) { /* Read from stdin for testing. */ audio = snsrStreamFromFILE(stdin, SNSR_ST_MODE_READ); /* Reduce the trailing silence time-out, as test recordings have less than * 1000 ms of silence at the end */ snsrSetInt(s, SNSR_TRAILING_SILENCE, 500); /* Reduce VAD margins to the absolute minimum for testing only. This could * lead to small portions of the beginning and end of the audio being lost. * The recommendation is to use default values for production code. */ snsrSetInt(s, SNSR_BACKOFF, 0); snsrSetInt(s, SNSR_HOLD_OVER, 0); } else { /* live audio */ audio = snsrStreamFromAudioDevice(SNSR_ST_AF_DEFAULT); } snsrSetStream(s, SNSR_SOURCE_AUDIO_PCM, audio); /* Set up the output stream. Speech-detected audio will be written to * this file. */ out = snsrStreamFromFileName(VAD_AUDIO_FILE, "w"); out = snsrStreamFromAudioStream(out, SNSR_ST_AF_DEFAULT); snsrSetStream(s, SNSR_SINK_AUDIO_PCM, out); printf("Say will it rain in Portland tomorrow?\n"); /* Main recognition loop. The endpoint handler will cause snsrRun() to * return SNSR_RC_STOP. Other return codes indicate an unexpected error. * Session errors remain until explicitly cleared: Any errors that occurred * earlier will also be reported here. 
*/ r = snsrRun(s); if (r != SNSR_RC_STOP) fatalError(r, snsrErrorDetail(s)); snsrRelease(s); printf("Wrote recording to \"%s\".\n", VAD_AUDIO_FILE); return 0; } ``` *[API]: Application Programming Interface *[SDK]: Software Development Kit *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "api/sample/c/live-spot-stream.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot-stream/" --- # live-spot-stream.c This example shows how to run a recognizer on live audio captured using a custom audio stream. **Also see these related items:** [alsa-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/alsa-stream.md#alsa-streamc), [aqs-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/aqs-stream.md#aqs-streamc), [wmme-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/wmme-stream.md#wmme-streamc), [fromProvider](https://doc.sensory.com/tnl/7.8/api/io.md#fromprovider) ## Instructions 1. [Build](https://doc.sensory.com/tnl/7.8/api/sample/c/index.md#examples-cmake) the sample code. In the same terminal window type the command after the `%`, then say "hello blue genie": ```console % ./bin/live-spot-stream ../../model/spot-hbg-enUS-1.4.0-m.snsr Spotted "hello blue genie" at 5.34 seconds. ``` 2. This example works with any [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) recognizer type, so let's create a concurrent combination of two wake words: ```console % ./bin/snsr-edit -vt ../../model/tpl-spot-concurrent-1.5.0.snsr \ -f 0 ../../model/spot-voicegenie-enUS-6.5.1-m.snsr \ -f 1 ../../model/spot-hbg-enUS-1.4.0-m.snsr \ -o spot-vg-or-hbg.snsr Output written to "spot-vg-or-hbg.snsr". ``` _spot-vg-or-hbg.snsr_ listens for both "voice genie" and "hello blue genie". Run the `live-spot-stream` example again and say one of these phrases: ```console % ./bin/live-spot-stream spot-vg-or-hbg.snsr Spotted "voicegenie" at 6.75 seconds. 
% ./bin/live-spot-stream spot-vg-or-hbg.snsr Spotted "hello blue genie" at 8.28 seconds. ``` 3. _(TrulyNatural only)_ A VAD combined with either an LVCSR or STT model also works: ```console % ./bin/snsr-edit -vt ../../model/tpl-vad-lvcsr-3.17.0.snsr \ -f 0 ../../model/lvcsr-build-enUS-14.0.2-5MB.snsr \ -g phrases-stream "hello world; this is a test sentence; stop everything" \ -o vad-lvcsr.snsr Output written to "vad-lvcsr.snsr". % ./bin/live-spot-stream vad-lvcsr.snsr Spotted "this is a test sentence" at 4.46 seconds. % ./bin/live-spot-stream vad-lvcsr.snsr Spotted "" at 2.15 seconds. % ./bin/live-spot-stream vad-lvcsr.snsr Spotted "stop everything" at 1.88 seconds. ``` ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/live-spot-stream.c_ **live-spot-stream.c:** ```c /* Sensory Confidential * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/ * * Keyword spotting minimal example using a custom audio stream. *------------------------------------------------------------------------------ */ #include #include /* Use the custom audio stream for this operating system */ #ifdef __linux__ # include "alsa-stream.h" # define streamFromCustom(r, m, l) streamFromALSA("default", r, m, l) #elif defined(__APPLE__) # include "aqs-stream.h" # define streamFromCustom(r, m, l) streamFromAQS(r, m, l) #elif defined(_WIN32) # include "wmme-stream.h" # define streamFromCustom(r, m, l) streamFromWMME(-1, r, m, l) #else # error Not supported on this platform. #endif /* Result callback function, see snsrSetHandler() in main() below. * Print the result text and the start time of the first spotted phrase. */ static SnsrRC resultEvent(SnsrSession s, const char *key, void *privateData) { SnsrRC r; const char *phrase; double begin; /* Retrieve the phrase text and start time from the session handle. */ snsrGetDouble(s, SNSR_RES_BEGIN_MS, &begin); r = snsrGetString(s, SNSR_RES_TEXT, &phrase); /* Quit early if an error occurred. 
*/ if (r != SNSR_RC_OK) return r; printf("Spotted \"%s\" at %.2f seconds.\n", phrase, begin/1000.0); /* Returning a code other than SNSR_RC_OK instructs snsrRun() to return it. */ return SNSR_RC_STOP; } int main(int argc, char *argv[]) { SnsrRC r; SnsrSession s; if (argc != 2) { fprintf(stderr, "usage: %s spotter-model\n", argv[0]); exit(1); } /* Create a new session handle. */ snsrNew(&s); /* Load and validate the spotter model task file. */ snsrLoad(s, snsrStreamFromFileName(argv[1], "r")); snsrRequire(s, SNSR_TASK_TYPE, SNSR_PHRASESPOT); /* Create a live audio stream instance using a custom stream type, * then attach it to the session. */ snsrSetStream(s, SNSR_SOURCE_AUDIO_PCM, streamFromCustom(16000, SNSR_ST_MODE_READ, STREAM_LATENCY_LOW)); /* Register a result callback. Private data handle is not used */ snsrSetHandler(s, SNSR_RESULT_EVENT, snsrCallback(resultEvent, NULL, NULL)); /* Main recognition loop. The result handler will cause snsrRun() to * return SNSR_RC_STOP. Other return codes indicate an unexpected error. * Session errors remain until explicitly cleared: Any errors that occurred * earlier will also be reported here. */ r = snsrRun(s); if (r != SNSR_RC_STOP) fprintf(stderr, "ERROR: %s\n", snsrErrorDetail(s)); /* Release the session. This will also release the model and audio streams, * and the callback handler. No other references to these handles are held, * so their memory will be reclaimed. 
*/ snsrRelease(s); return r; } ``` *[ALSA]: Advanced Linux Sound Architecture *[API]: Application Programming Interface *[AQS]: Audio Queue Services, Apple's audio capture API on Darwin / macOS *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[SDK]: Software Development Kit *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector *[WMME]: Windows Multimedia Extensions, the audio capture API on Windows --- source_path: "api/sample/c/live-spot.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot/" --- # live-spot.c New to the Session API? Start with [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program) (`first-spot.c`), then return here for the full sample. The same [load](https://doc.sensory.com/tnl/7.8/api/inference.md#load) + [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run) pattern works for composed models built with [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit)—only the `.snsr` path changes (see step 3 below and [Quick start § Speech To Text](https://doc.sensory.com/tnl/7.8/getting-started/index.md#qs-stt)). This example shows how to run a recognizer on live audio captured from the default audio source. For Python, see [hello_world.py](https://doc.sensory.com/tnl/7.8/api/sample/python/hello_world.md#hello_worldpy) for the same pull-mode [load](https://doc.sensory.com/tnl/7.8/api/inference.md#load) + [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run) pattern on file audio, or [live_audio.py](https://doc.sensory.com/tnl/7.8/api/sample/python/live_audio.md#live_audiopy) for microphone capture. ## Instructions 1. [Build](https://doc.sensory.com/tnl/7.8/api/sample/c/index.md#examples-cmake) the sample code. 
In the same terminal window type the command after the `%`, then say "voice genie": ```console % ./bin/live-spot ../../model/spot-voicegenie-enUS-6.5.1-m.snsr Spotted "voicegenie" at 4.56 seconds. ``` 2. This example works with any [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) recognizer type, so let's create a sequential combination of a wake word and a command set: ```console % ./bin/snsr-edit -vt ../../model/tpl-spot-sequential-1.5.0.snsr \ -f 0 ../../model/spot-voicegenie-enUS-6.5.1-m.snsr \ -f 1 ../../model/spot-music-enUS-1.2.0-m.snsr \ -o spot-vg-music.snsr Output written to "spot-vg-music.snsr". ``` _spot-vg-music.snsr_ listens for the wake word "voice genie". Once detected, it listens for up to five seconds for a small set of music player commands: "play music", "stop music", "previous song", "next song", and "pause music". Run the `live-spot` example again and say some of the music commands. Then try "voice genie" followed by one of the commands again: ```console % ./bin/live-spot spot-vg-music.snsr Spotted "play_music" at 10.32 seconds. ``` 3. _(STT only)_ A VAD combined with either an LVCSR or STT model also works: ```console % ./bin/snsr-edit -vt ../../model/tpl-vad-lvcsr-3.17.0.snsr \ -f 0 ../../model/stt-enUS-automotive-medium-2.3.15-pnc.snsr \ -o vad-stt.snsr Output written to "vad-stt.snsr". % ./bin/live-spot vad-stt.snsr Spotted "Hello world. This is a test sentence." at 2.61 seconds. ``` ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/live-spot.c_ **live-spot.c:** ```c /* Sensory Confidential * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/ * * TrulyHandsfree SDK keyword spotting minimal example. *------------------------------------------------------------------------------ */ #include #include /* Result callback function, see snsrSetHandler() in main() below. * Print the result text and the start time of the first spotted phrase. 
*/ static SnsrRC resultEvent(SnsrSession s, const char *key, void *privateData) { SnsrRC r; const char *phrase; double begin; /* Retrieve the phrase text and start time from the session handle. */ snsrGetDouble(s, SNSR_RES_BEGIN_MS, &begin); r = snsrGetString(s, SNSR_RES_TEXT, &phrase); /* Quit early if an error occurred. */ if (r != SNSR_RC_OK) return r; printf("Spotted \"%s\" at %.2f seconds.\n", phrase, begin/1000.0); /* Returning a code other than SNSR_RC_OK instructs snsrRun() to return it. */ return SNSR_RC_STOP; } int main(int argc, char *argv[]) { SnsrRC r; SnsrSession s; if (argc != 2) { fprintf(stderr, "usage: %s spotter-model\n", argv[0]); exit(1); } /* Create a new session handle. */ snsrNew(&s); /* Load and validate the spotter model task file. */ snsrLoad(s, snsrStreamFromFileName(argv[1], "r")); snsrRequire(s, SNSR_TASK_TYPE, SNSR_PHRASESPOT); /* Create a live audio stream instance and attach it to the session. */ snsrSetStream(s, SNSR_SOURCE_AUDIO_PCM, snsrStreamFromAudioDevice(SNSR_ST_AF_DEFAULT)); /* Register a result callback. Private data handle is not used */ snsrSetHandler(s, SNSR_RESULT_EVENT, snsrCallback(resultEvent, NULL, NULL)); /* Main recognition loop. The result handler will cause snsrRun() to * return SNSR_RC_STOP. Other return codes indicate an unexpected error. * Session errors remain until explicitly cleared: Any errors that occurred * earlier will also be reported here. */ r = snsrRun(s); if (r != SNSR_RC_STOP) fprintf(stderr, "ERROR: %s\n", snsrErrorDetail(s)); /* Release the session. This will also release the model and audio streams, * and the callback handler. No other references to these handles are held, * so their memory will be reclaimed. 
*/ snsrRelease(s); return r; } ``` *[API]: Application Programming Interface *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[PNC]: Punctuation and Capitalization, an STT model variant that emits cased text with punctuation *[SDK]: Software Development Kit *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "api/sample/c/push-audio.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/push-audio/" --- # push-audio.c This example runs a recognizer where the application pushes data through the recognition pipeline. It also shows VAD audio processing for use with third-party recognizers such as keyword-to-search applications. For Python push-mode audio, see [stt_push.py](https://doc.sensory.com/tnl/7.8/api/sample/python/stt_push.md#stt-push-py). ## Instructions 1. [Build](https://doc.sensory.com/tnl/7.8/api/sample/c/index.md#examples-cmake) the sample code. In the same terminal window type the command after the `%`, then say "voice genie" a number of times. Stop the executable with `^C`. ```console % ./bin/push-audio ../../model/spot-voicegenie-enUS-6.5.1-m.snsr Recognized "voicegenie" from sample 42960 to sample 53760. Recognized "voicegenie" from sample 90720 to sample 101280. Recognized "voicegenie" from sample 141360 to sample 155040. ^C ``` 2. Enroll a custom wake word. In the same terminal window type the command after the `%`: ```console % ./bin/spot-enroll -o spot-armadillo.snsr \ -vt ../../model/udt-universal-3.67.1.0.snsr \ +armadillo ../../data/enrollments/armadillo-1-{0,1,2,3}.wav Adapting: 100% complete. Enrolled model saved to "spot-armadillo.snsr" ``` 3. 
Run the `push-audio` example on the custom wake word with one of the example audio files: ```console % ./bin/push-audio spot-armadillo.snsr \ ../../data/enrollments/armadillo-1-0-c.wav Recognized "armadillo" from sample 5280 to sample 15120. ``` ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/push-audio.c_ **push-audio.c:** ```c /* Sensory Confidential * Copyright (C)2017-2026 Sensory, Inc. https://sensory.com/ * * TrulyHandsfree SDK recognition from file example, where audio processing * is driven by the application, also known as push mode processing. *------------------------------------------------------------------------------ */ #include #include /* Ten second output ring buffer for optional VAD */ #define RING_BUFFER_SIZE 320000 /* Process 15 ms of 16-bit audio sampled at 16 kHz */ #define CHUNK_SIZE 480 #define TASKS_SUPPORTED\ SNSR_PHRASESPOT " 1.0.0;"\ SNSR_PHRASESPOT_VAD " 1.0.0;"\ SNSR_LVCSR " 1.0.0;"\ SNSR_VAD " 1.0.0" /* VAD endpoint event callback function. * Print the segmentation found, and return SNSR_RC_STOP to exit the main loop. */ static SnsrRC endEvent(SnsrSession s, const char *key, void *privateData) { double begin, end; snsrGetDouble(s, SNSR_RES_BEGIN_MS, &begin); snsrGetDouble(s, SNSR_RES_END_MS, &end); printf("VAD found audio from %.0f ms to %.0f ms.\n", begin, end); return SNSR_RC_STOP; } /* VAD detected silence, print a message and continue. */ static SnsrRC silenceEvent(SnsrSession s, const char *key, void *privateData) { printf("VAD detected silence. Listening for trigger.\n"); return SNSR_RC_OK; } /* Result callback function, see snsrSetHandler() in main() below. * Print the result hypothesis and the start and end sample indices * for this phrase. 
*/ static SnsrRC resultEvent(SnsrSession s, const char *key, void *privateData) { SnsrRC r; const char *phrase; double begin, end; /* Retrieve the phrase text and alignments from the session handle */ snsrGetDouble(s, SNSR_RES_BEGIN_SAMPLE, &begin); snsrGetDouble(s, SNSR_RES_END_SAMPLE, &end); r = snsrGetString(s, SNSR_RES_TEXT, &phrase); /* Quit early if an error occurred. */ if (r != SNSR_RC_OK) return r; printf("Recognized \"%s\" from sample %.0f to sample %.0f.\n", phrase, begin, end); return SNSR_RC_OK; } /* Print error message and exit */ static void fatal(int rc, const char *format, ...) { va_list a; va_start(a, format); vfprintf(stderr, format, a); va_end(a); fprintf(stderr, "\n"); exit(rc); } int main(int argc, char *argv[]) { SnsrRC r; SnsrSession s; SnsrStream a, out = NULL; char buffer[CHUNK_SIZE]; size_t read; if (argc != 2 && argc != 3) fatal(255, "usage: %s model [wavefile]", argv[0]); /* Create a new session handle. */ snsrNew(&s); /* Load the model task file. */ snsrLoad(s, snsrStreamFromFileName(argv[1], "r")); /* Validate model types this application supports. */ r = snsrRequire(s, SNSR_TASK_TYPE_AND_VERSION_LIST, TASKS_SUPPORTED); if (r != SNSR_RC_OK) fatal(r, "ERROR: %s", snsrErrorDetail(s)); /* Check whether an output audio channel exists, wire it in if it does */ r = snsrSetStream(s, SNSR_SINK_AUDIO_PCM, NULL); if (r == SNSR_RC_DST_CHANNEL_NOT_FOUND) { snsrClearRC(s); } else { out = snsrStreamFromBuffer(RING_BUFFER_SIZE, RING_BUFFER_SIZE); snsrRetain(out); snsrSetStream(s, SNSR_SINK_AUDIO_PCM, out); /* Register VAD endpoint callbacks. */ snsrSetHandler(s, SNSR_END_EVENT, snsrCallback(endEvent, NULL, NULL)); snsrSetHandler(s, SNSR_LIMIT_EVENT, snsrCallback(endEvent, NULL, NULL)); snsrSetHandler(s, SNSR_SILENCE_EVENT, snsrCallback(silenceEvent, NULL, NULL)); } if (argc == 2) { /* Open a stream handle on the default microphone for live audio. 
     */
    a = snsrStreamFromAudioDevice(SNSR_ST_AF_DEFAULT);
  } else {
    /* Open a stream handle on the audio file. */
    a = snsrStreamFromAudioFile(argv[2], "r", SNSR_ST_AF_DEFAULT);
  }

  /* Register a result callback. Private data handle is not used. */
  r = snsrSetHandler(s, SNSR_RESULT_EVENT,
                     snsrCallback(resultEvent, NULL, NULL));
  /* Pure VAD models do not support SNSR_RESULT_EVENT, ignore the error */
  if (r == SNSR_RC_SETTING_NOT_FOUND) snsrClearRC(s);

  /* Main recognition loop. */
  do {
    /* Read from the audio stream into the temporary workspace. */
    read = snsrStreamRead(a, buffer, sizeof(*buffer), CHUNK_SIZE);
    if (snsrStreamRC(a) != SNSR_RC_OK && snsrStreamRC(a) != SNSR_RC_EOF)
      fatal(snsrStreamRC(a), "ERROR: %s", snsrStreamErrorDetail(a));

    /* Process one block of audio. */
    r = snsrPush(s, SNSR_SOURCE_AUDIO_PCM, buffer, read);
    /* The VAD endpoint callback returns SNSR_RC_STOP. */
    if (r == SNSR_RC_STOP) break;
    if (r != SNSR_RC_OK) fatal(r, "ERROR: %s", snsrErrorDetail(s));

    /* If this pipeline includes a voice activity detector,
     * read from the ring buffer stream, then process that output
     */
    if (out) {
#define VAD_CHUNK_SIZE 2400
      short samples[VAD_CHUNK_SIZE];
      size_t read = snsrStreamRead(out, samples,
                                   sizeof(*samples), VAD_CHUNK_SIZE);
      if (read > 0) {
        /* samples[] now contains read VAD audio samples. */
        printf("Read %u samples from VAD stream.\n", (unsigned)read);
      }
    }
  } while (!snsrStreamAtEnd(a));

  /* Flush internal audio ring buffer, stop any session threads */
  r = snsrStop(s);

  /* Release the session. */
  snsrRelease(s);

  /* Release the audio stream. */
  snsrRelease(a);

  /* Release the ring buffer (VAD output) stream */
  snsrRelease(out);

  /* POSIX process return code. */
  return r == SNSR_RC_OK || r == SNSR_RC_STOP?
    0: r;
}
```

*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[UDT]: User-Defined Trigger: enrolled wake words and command sets
*[VAD]: Voice Activity Detector

---
source_path: "api/sample/c/snsr-edit.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-edit/"
---

# snsr-edit.c

This is the source code for the [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit) command-line tool.

## Instructions

See [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit).

## Code

Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/snsr-edit.c_

**snsr-edit.c:**

```c
/* Sensory Confidential
 * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/
 *
 * TrulyHandsfree SDK command-line model editing utility.
 *------------------------------------------------------------------------------
 */

#include
#include
#include
#include

#define TASK_VERSION "~0.8.0 || 1.0.0"
#define DEFAULT_INIT_FILENAME "snsr-custom-init.c";

static void
fatal(SnsrSession s, int rc, const char *format, ...)
{ va_list a; fprintf(stderr, "ERROR: "); va_start(a, format); vfprintf(stderr, format, a); va_end(a); fprintf(stderr, "\n"); snsrRelease(s); exit(rc); } static const char *usageDetail = "Settings are strings used as keys to query or change task behavior.\n" "Most frequently used are operating-point for wake words and command sets,\n" "leading-silence and trailing-silence for VAD templates,\n" "partial-result-interval for LVCSR and STT, and stt-profile for STT models.\n" "Refer to the " SNSR_NAME " SDK documentation for a complete list and\n" "descriptions of all supported settings.\n"; static void usage(const char *name) { SnsrSession s; const char *libInfo; fprintf(stderr, "Edits/modifies " SNSR_NAME " SDK .snsr model files.\n\n"); fprintf(stderr, "usage: %s -t task [options]\n" " options:\n" " -C tag-identifier : emit C source to load model into RAM\n" " -c tag-identifier : emit C source to run model from code space\n" " -e setting filename : extract task setting/slot into filename\n" " -f setting filename : load filename into task setting/slot\n" " -g setting value : load string into task setting\n" " -i : emit custom initialization code\n" " -o out : output filename\n" " -p : prune unused settings to reduce model size\n" " -q setting : query a task setting\n" " -s setting=value : override a task setting\n" " -t task : specify task filename (required)\n" " -v [-v [-v]] : increase verbosity\n", name); fprintf(stderr, "\n%s", usageDetail); snsrNew(&s); snsrGetString(s, SNSR_LIBRARY_INFO, &libInfo); fprintf(stderr, "\n%s\n", libInfo); snsrRelease(s); exit(199); } /* Report command-line argument errors. */ static void quitOnError(SnsrSession s) { SnsrRC r = snsrRC(s); if (r == SNSR_RC_NO_MODEL) fatal(s, r, "set -t task before -f, -q, or -s options"); if (r != SNSR_RC_OK) fatal(s, r, "%s", snsrErrorDetail(s)); } /* Report model license keys. 
*/ static void reportModelLicense(SnsrSession s, const char *modelfile, int verbose) { const char *msg = NULL; if (verbose > 1) { snsrGetString(s, SNSR_MODEL_LICENSE_EXPIRES, &msg); if (msg) fprintf(stderr, "\"%s\": %s.\n", modelfile, msg); } msg = NULL; snsrGetString(s, SNSR_MODEL_LICENSE_WARNING, &msg); if (msg) fprintf(stderr, "WARNING for model \"%s\": %s.\n", modelfile, msg); } int main(int argc, char *argv[]) { SnsrRC r; SnsrSession s; int customInit = 0, o, prune = 0, verbose = 0, useRAM = 0; const char *msg = NULL, *out = NULL, *task = NULL, *tag = NULL; char *outPath = NULL; extern char *optarg; extern int optind; #ifdef SNSR_USE_SECURITY_CHIP uint32_t *securityChipComms(uint32_t *in); snsrConfig(SNSR_CONFIG_SECURITY_CHIP, securityChipComms); #endif if (argc == 1) usage(argv[0]); r = snsrNew(&s); if (r != SNSR_RC_OK) fatal(s, r, "%s", s? snsrErrorDetail(s): snsrRCMessage(r)); snsrSetString(s, SNSR_PREPARE_SUBSET_INIT, NULL); snsrSetString(s, SNSR_PRUNE_SETTINGS, "no"); while ((o = getopt(argc, argv, "aC:c:e:f:g:io:pq:s:t:v?")) >= 0) { switch (o) { case 'C': useRAM = 1; case 'c': tag = optarg; break; case 'e': if (optind >= argc) usage(argv[0]); { SnsrStream slot, out; snsrGetStream(s, optarg, &slot); quitOnError(s); out = snsrStreamFromFileName(argv[optind], "w"); snsrRetain(out); if (verbose > 1) printf("Saving setting \"%s\" into \"%s\".\n", optarg, argv[optind]); snsrStreamCopy(out, slot, SIZE_MAX); if (snsrStreamRC(out) != SNSR_RC_EOF) fatal(s, snsrStreamRC(out), "%s", snsrStreamErrorDetail(out)); snsrRelease(out); optind++; } break; case 'f': if (optind >= argc) usage(argv[0]); if (verbose > 1) printf("Loading \"%s\" into setting \"%s\".\n", argv[optind], optarg); snsrSetStream(s, optarg, snsrStreamFromFileName(argv[optind++], "r")); quitOnError(s); reportModelLicense(s, argv[optind - 1], verbose); break; case 'g': if (optind >= argc) usage(argv[0]); if (verbose > 1) printf("Loading \"%s\" into setting \"%s\".\n", argv[optind], optarg); 
snsrSetStream(s, optarg, snsrStreamFromString(argv[optind++])); quitOnError(s); break; case 'i': customInit = 1; if (!out) out = DEFAULT_INIT_FILENAME; break; case 'o': out = optarg; break; case 'p': prune = 1; break; case 'q': { const char *strVal = NULL; if (snsrGetString(s, optarg, &strVal) == SNSR_RC_OK) { const char *q = strVal && strchr(strVal, ' ')? "\"": ""; printf("%s = %s%s%s\n", optarg, q, strVal, q); } quitOnError(s); break; } case 's': if (verbose > 2) printf("Applying setting \"%s\".\n", optarg); snsrSet(s, optarg); quitOnError(s); break; case 't': if (verbose > 1) printf("Loading \"%s\" as the template model.\n", optarg); snsrLoad(s, snsrStreamFromFileName(optarg, "r")); if (!task) task = optarg; quitOnError(s); reportModelLicense(s, optarg, verbose); break; case 'v': verbose++; break; case '?': default: usage(argv[0]); } } r = snsrRequire(s, SNSR_TASK_VERSION, TASK_VERSION); if (r == SNSR_RC_NO_MODEL || optind != argc) usage(argv[0]); /* Report application license status */ if (verbose > 1) { snsrGetString(s, SNSR_LICENSE_EXPIRES, &msg); if (msg) fprintf(stderr, "\"%s\": %s.\n", argv[0], msg); } msg = NULL; snsrGetString(s, SNSR_LICENSE_WARNING, &msg); if (msg) fprintf(stderr, "WARNING for \"%s\": %s.\n", argv[0], msg); snsrSetString(s, SNSR_MODEL_NAME, task); if (tag) { SnsrDataFormat fmt = SNSR_FM_SOURCE; if (!out) { char *t; if (!(out = outPath = malloc(strlen(task) + 3))) fatal(s, SNSR_RC_NO_MEMORY, "%s", snsrRCMessage(SNSR_RC_NO_MEMORY)); strcpy(outPath, task); if ((t = strrchr(outPath, '.'))) *t = '\0'; strcat(outPath, ".c"); if ((t = strrchr(outPath, '/')) || (t = strrchr(outPath, '\\'))) { *t = '\0'; out = t + 1; } } snsrSetString(s, SNSR_TAG_IDENTIFIER, tag); if (useRAM) fmt = SNSR_FM_SOURCE_RAM; else if (prune) fmt = SNSR_FM_SOURCE_PRUNED; r = snsrSave(s, fmt, snsrStreamFromFileName(out, "w")); } else if (customInit) { r = snsrSave(s, SNSR_FM_SUBSET_INIT, snsrStreamFromFileName(out, "w")); } else if (out) { r = snsrSave(s, prune? 
                 SNSR_FM_CONFIG_PRUNED: SNSR_FM_CONFIG,
                 snsrStreamFromFileName(out, "w"));
  }
  if (r != SNSR_RC_OK) fatal(s, r, "%s", snsrErrorDetail(s));
  if (out && verbose > 0) printf("Output written to \"%s\".\n", out);
  free(outPath);
  snsrRelease(s);
  snsrTearDown();
  return 0;
}
```

*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "api/sample/c/snsr-eval.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval/"
---

# snsr-eval.c

This is the source code for the [snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval) command-line tool.

The sample build also creates `snsr-eval-subset` from the same source. This is a version of `snsr-eval` that includes only the code modules required to run `spot-hbg-enUS-1.4.0-m.snsr`. It's built with `-DSNSR_USE_SUBSET` and `snsr-custom-init.c` created by [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit). See [Compile-time macros § SNSR_USE_SUBSET](https://doc.sensory.com/tnl/7.8/api/compile-macros.md#snsr-use-subset).

See [src/CMakeLists.txt](https://doc.sensory.com/tnl/7.8/api/sample/c/index.md#cmakeliststxt) or [Makefile](https://doc.sensory.com/tnl/7.8/api/sample/c/index.md#makefile) for details.

## Instructions

See [snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval).

## Code

Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/snsr-eval.c_

**snsr-eval.c:**

```c
/* Sensory Confidential
 *
 * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/
 *
 * TrulyHandsfree SDK model evaluation command-line utility.
 *------------------------------------------------------------------------------
 * This utility supports evaluation of many of the TrulyHandsfree/TrulyNatural
 * SDK task types. The source code is provided as a detailed example.
 *
 * For most keyword spotting implementations the live-spot.c sample is a
 * better starting point.
 *------------------------------------------------------------------------------
 */

#ifdef _WIN32
# include
#endif
#include
#include
#include
#include

#define TASKS_SUPPORTED\
  SNSR_PHRASESPOT         " ~0.5.0 || 1.0.0;"\
  SNSR_PHRASESPOT_VAD     " ~0.5.0 || 1.0.0;"\
  SNSR_LVCSR              " 1.0.0;"\
  SNSR_VAD                " 1.0.0;" \
  SNSR_FEATURE            " 1.0.0;" \
  SNSR_FEATURE_PHRASESPOT " 1.0.0;"\
  SNSR_FEATURE_LVCSR      " 1.0.0;"\
  SNSR_FEATURE_VAD        " 1.0.0"

#define DEFAULT_SAMPLE_RATE 16000
#define DEFAULT_FRAME_SIZE_MS 15

/* Used with the -i listFile batch processing option */
#define MAX_FILENAME_LENGTH 2048

#if defined(_MSC_VER) && (_MSC_VER < 1900)
# define snprintf _snprintf
#endif

typedef struct {
  int nBest;            /* Requested N-best results, usually 1 */
  int verbose;          /* Amount of detail resultEvent prints */
  unsigned isPartial:1; /* 1 if this is a preliminary result */
  unsigned isPhrase:1;  /* 1 if this is a phrase-level iteration */
} ResultConfig;

typedef struct {
  char *filename; /* partial output filename, e.g. out/ */
  size_t prefix;  /* filename directory path length */
  size_t length;  /* size of the filename buffer */
  int verbose;
} VadContext;

static SnsrRC
showAlignment(SnsrSession s, const char *key, void *privateData)
{
  SnsrRC r;
  ResultConfig *config = (ResultConfig *)privateData;
  const char *phrase;
  const char *partial = config->isPartial?
"P ": ""; double begin, end, score = -1.0, svscore; snsrGetString(s, SNSR_RES_TEXT, &phrase); snsrGetDouble(s, SNSR_RES_BEGIN_MS, &begin); snsrGetDouble(s, SNSR_RES_END_MS, &end); r = snsrGetDouble(s, SNSR_RES_SCORE, &score); if (r == SNSR_RC_SETTING_NOT_FOUND) snsrClearRC(s); r = snsrGetDouble(s, SNSR_RES_SV_SCORE, &svscore); if (r != SNSR_RC_OK) return r; if (config->nBest > 1 && config->isPhrase && !config->isPartial) { int count = 1, index = 0; snsrGetInt(s, SNSR_RES_COUNT, &count); snsrGetInt(s, SNSR_RES_INDEX, &index); printf("%2i/%i %6.0f %6.0f %s\n", index + 1, count, begin, end, phrase); } else if (config->verbose <= -2) { if (!config->isPartial) printf("%s\n", phrase); } else if (config->verbose == -1) { printf("%s%20s", phrase, config->isPartial? "\r": "\n"); } else if (config->verbose == 0) { printf("%s%6.0f %6.0f %s\n", partial, begin, end, phrase); } else { printf("%s%6.0f %6.0f (%.4g%s) %s\n", partial, begin, end, score >= 0? score: svscore, score >= 0? "": " sv", phrase); } fflush(stdout); return r; } static SnsrRC showAvailablePoint(SnsrSession s, const char *key, void *privateData) { SnsrRC r; int point, *first = (int *)privateData; r = snsrGetInt(s, SNSR_RES_AVAILABLE_POINT, &point); if (r == SNSR_RC_OK) { if (*first) printf("Available operating points: %i", point); else printf(", %i", point); *first = 0; } return r; } static SnsrRC showVocab(SnsrSession s, const char *key, void *privateData) { SnsrRC r; const char *text = NULL; int id = -1, *first = (int *)privateData; snsrGetInt(s, SNSR_RES_ID, &id); r = snsrGetString(s, SNSR_RES_TEXT, &text); if (r != SNSR_RC_OK) return r; if (*first) printf("Available vocabulary:\n"); printf(" %2i: \"%s\"\n", id, text); *first = 0; return r; } static SnsrRC entityIterator(SnsrSession s, const char *key, void *privateData) { const char *entity, *value; double score = 0; snsrGetDouble(s, SNSR_RES_NLU_ENTITY_SCORE, &score); snsrClearRC(s); /* Not all models support NLU scores, ignore errors */ snsrGetString(s, 
SNSR_RES_NLU_ENTITY_NAME, &entity); snsrGetString(s, SNSR_RES_NLU_ENTITY_VALUE, &value); printf("NLU entity: %s (%.4g) = %s\n", entity, score, value); return snsrRC(s); } static SnsrRC intentEvent(SnsrSession s, const char *key, void *privateData) { const char *intent, *value; double score = 0; snsrGetDouble(s, SNSR_RES_NLU_INTENT_SCORE, &score); snsrClearRC(s); /* Not all models support NLU scores, ignore errors */ snsrGetString(s, SNSR_RES_NLU_INTENT_NAME, &intent); snsrGetString(s, SNSR_RES_NLU_INTENT_VALUE, &value); printf("NLU intent: %s (%.4g) = %s\n", intent, score, value); return snsrForEach(s, SNSR_NLU_ENTITY_LIST, snsrCallback(entityIterator, NULL, privateData)); } static SnsrRC nluEvent(SnsrSession s, const char *key, void *privateData) { SnsrRC r; const char *name, *parentPath = (char *)privateData, *value; char *path; double score = 0; size_t pathLen = 0; int nluMax = 1, nBest = 1; snsrGetDouble(s, SNSR_RES_NLU_SLOT_SCORE, &score); snsrClearRC(s); /* Not all models support NLU scores, ignore errors */ snsrGetString(s, SNSR_RES_NLU_SLOT_NAME, &name); r = snsrGetString(s, SNSR_RES_NLU_SLOT_VALUE, &value); if (r != SNSR_RC_OK) return r; pathLen = strlen(SNSR_RES_NLU_SLOT_VALUE) + strlen(name) + 1; if (parentPath) pathLen += strlen(parentPath) + 1; path = malloc(pathLen); if (!path) return SNSR_RC_NO_MEMORY; if (!parentPath) { strcpy(path, SNSR_RES_NLU_SLOT_VALUE); strcat(path, name); } else { strcpy(path, parentPath); strcat(path, "."); strcat(path, name); } /* SNSR_NLU_RES_MAX introduced in 6.16.0, missing from older models */ r = snsrGetInt(s, SNSR_NLU_RES_MAX, &nluMax); if (r != SNSR_RC_OK) snsrClearRC(s); /* SNSR_RESULT_MAX introduced in 6.17.0, missing from older models */ r = snsrGetInt(s, SNSR_RESULT_MAX, &nBest); if (r != SNSR_RC_OK) snsrClearRC(s); if (nBest > 1 || nluMax > 1) { int nluIndex = 0, nluCount = 1, recIndex = 0, recCount = 0; snsrGetInt(s, SNSR_RES_COUNT, &recCount); snsrGetInt(s, SNSR_RES_INDEX, &recIndex); snsrClearRC(s); /* 
SNSR_RES_{COUNT,INDEX} not available before 6.17.0 */ snsrGetInt(s, SNSR_RES_NLU_COUNT, &nluCount); snsrGetInt(s, SNSR_RES_NLU_INDEX, &nluIndex); if (recCount > 1) { printf("%2i/%i NLU %2i/%i %s (%.4g) = %s\n", recIndex + 1, recCount, nluIndex + 1, nluCount, path, score, value); } else { printf("NLU %2i/%i %s (%.4g) = %s\n", nluIndex + 1, nluCount, path, score, value); } } else { printf("NLU %s (%.4g) = %s\n", path, score, value); } r = snsrForEach(s, SNSR_NLU_SLOT_LIST, snsrCallback(nluEvent, NULL, path)); free(path); return r; } static SnsrRC resultEvent(SnsrSession s, const char *key, void *privateData) { SnsrCallback c; ResultConfig *config = (ResultConfig *)privateData; const char *partial = config->isPartial? "P ": ""; /* Skip empty (LVCSR) results. */ if (config->nBest > 1) { const char *phrase = NULL; snsrGetString(s, SNSR_RES_TEXT, &phrase); if (phrase && !phrase[0]) return snsrRC(s); } if (config->verbose > 0 && !config->isPartial) { const char *domain = NULL; SnsrRC r = snsrGetString(s, SNSR_RES_DOMAIN, &domain); if (r != SNSR_RC_OK) snsrClearRC(s); else if (domain) printf("domain: %s\n", domain); } c = snsrCallback(showAlignment, NULL, privateData); snsrRetain(c); if (config->verbose > 1) printf("%sphrase:\n", partial); config->isPhrase = 1; snsrForEach(s, SNSR_PHRASE_LIST, c); config->isPhrase = 0; if (config->verbose > 1) { printf("%swords:\n", partial); snsrForEach(s, SNSR_WORD_LIST, c); } if (config->verbose > 2) { printf("%sphonemes:\n", partial); snsrForEach(s, SNSR_PHONE_LIST, c); } if (config->verbose > 1) { printf("\n"); fflush(stdout); } snsrRelease(c); return snsrRC(s); } /* The SNSR_ADAPT_STARTED_EVENT is called from a worker thread with * the SnsrSession argument set to NULL. 
*/ static SnsrRC adaptStartedEvent(SnsrSession s, const char *key, void *privateData) { printf(" [%s] on worker thread\n", key); fflush(stdout); return SNSR_RC_OK; } /* Display events with sample time-stamps */ static SnsrRC showEvent(SnsrSession s, const char *key, void *privateData) { SnsrRC r; double samples, timestamp = 0; int rate = DEFAULT_SAMPLE_RATE; r = snsrGetInt(s, SNSR_SAMPLE_RATE, &rate); /* VAD task types do not include SNSR_SAMPLE_RATE support, use default */ if (r == SNSR_RC_SETTING_NOT_FOUND) snsrClearRC(s); r = snsrGetDouble(s, SNSR_RES_SAMPLES, &samples); if (r == SNSR_RC_OK) { timestamp = samples / rate * 1000; } else if (r == SNSR_RC_SETTING_NOT_FOUND) { double frames = 0; snsrClearRC(s); r = snsrGetDouble(s, SNSR_RES_FRAMES, &frames); timestamp = frames * DEFAULT_FRAME_SIZE_MS; } if (privateData) { const char *user = "(unknown)"; snsrGetString(s, SNSR_USER, &user); printf("%6.0f [%s] %s\n", timestamp, key, user); } else { printf("%6.0f [%s]\n", timestamp, key); } fflush(stdout); return snsrRC(s); } /* VAD start point detected */ static SnsrRC vadBeginEvent(SnsrSession s, const char *key, void *privateData) { VadContext *c = (VadContext *)privateData; if (c->verbose > 1) showEvent(s, key, NULL); if (c->filename) { SnsrStream out; double begin = 0; snsrGetDouble(s, SNSR_RES_BEGIN_MS, &begin); snprintf(c->filename + c->prefix, c->length - c->prefix, "%.0f.wav", begin); out = snsrStreamFromAudioFile(c->filename, "w", SNSR_ST_AF_DEFAULT); snsrSetStream(s, SNSR_SINK_AUDIO_PCM, out); snsrSetInt(s, SNSR_PASS_THROUGH, 1); if (c->verbose > 0) { printf("Saving VAD audio to \"%s\".\n", c->filename); fflush(stdout); } } return snsrRC(s); } /* VAD silence detected */ static SnsrRC vadSilenceEvent(SnsrSession s, const char *key, void *privateData) { VadContext *c = (VadContext *)privateData; if (c->verbose > 1) showEvent(s, key, NULL); return snsrRC(s); } /* Vad endpoint event */ static SnsrRC vadEndEvent(SnsrSession s, const char *key, void *privateData) { 
double begin, end; VadContext *c = (VadContext *)privateData; if (c->verbose > 0) { snsrGetDouble(s, SNSR_RES_BEGIN_MS, &begin); snsrGetDouble(s, SNSR_RES_END_MS, &end); printf("%6.0f %6.0f [%s] VAD speech region.\n", begin, end, key); fflush(stdout); } return snsrRC(s); } /* Optional SLM processing is about to start. */ static SnsrRC slmStartEvent(SnsrSession s, const char *key, void *privateData) { printf("SLM: "); fflush(stdout); return SNSR_RC_OK; } /* SLM partial result event, SNSR_RES_TEXT is the current next word prediction */ static SnsrRC slmPartialResultEvent(SnsrSession s, const char *key, void *privateData) { const char *txt = NULL; SnsrRC r = snsrGetString(s, SNSR_RES_TEXT, &txt); if (r != SNSR_RC_OK) return r; printf("%s", txt); fflush(stdout); return SNSR_RC_OK; } /* SLM final result event, SNSR_RES_TEXT is the entire result */ static SnsrRC slmResultEvent(SnsrSession s, const char *key, void *privateData) { printf("\n"); fflush(stdout); return SNSR_RC_OK; } static void fatal(SnsrSession s, int rc, const char *format, ...) { va_list a; fprintf(stderr, "ERROR: "); va_start(a, format); vfprintf(stderr, format, a); va_end(a); fprintf(stderr, "\n"); snsrRelease(s); exit(rc); } static void fatalSession(SnsrSession s) { fatal(s, snsrRC(s), "%s", snsrErrorDetail(s)); } static const char *usageDetail = "Use a filename of - to read headerless linear 16-bit PCM little-endian \n" "audio from stdin. If you don't specify any wave files, snsr-eval uses\n" "live audio captured from the default audio device.\n" "\n" "The -d and -o options are mutually exclusive. The output directory\n" "must be writable. 
Audio files created by VAD segmentation are named\n" " /.wav\n" "\n" "Settings are strings used as keys to query or change task behavior.\n" "Most frequently used are operating-point for wake words and command sets,\n" "leading-silence and trailing-silence for VAD templates,\n" "partial-result-interval for LVCSR and STT, and stt-profile for STT models.\n" "Refer to the " SNSR_NAME " SDK documentation for a complete list and\n" "descriptions of all supported settings.\n"; static void usage(const char *name) { SnsrSession s; const char *libInfo; fprintf(stderr, "Evaluates/runs " SNSR_NAME " SDK .snsr model files.\n\n"); fprintf(stderr, "usage: %s -t task [options] [wavefile ...]\n" " options:\n" " -a : Add tpl-vad-lvcsr to LVCSR and STT models\n" " -d directory : VAD audio output directory\n" " -f setting filename : load filename into task setting\n" " -g setting value : load string into task setting\n" " -i listFile : run evaluation on each filename" " in listFile\n" " -l [-l [-l]] : reduce verbosity\n" " -o out : output filename for VAD audio or" " listFile results\n" " -p [-p] : Enable pipeline profiling (experimental)\n" " -q setting : query a task setting\n" " -s setting=value : override a task setting\n" " -t task : specify task filename (required)\n" " -u filename : remove unused settings and save model" " to filename\n" " -v [-v [-v]] : increase verbosity\n", name); fprintf(stderr, "\n%s", usageDetail); snsrNew(&s); snsrGetString(s, SNSR_LIBRARY_INFO, &libInfo); fprintf(stderr, "\n%s\n", libInfo); snsrRelease(s); exit(199); } /* Report command-line argument errors. */ static void quitOnError(SnsrSession s) { SnsrRC r = snsrRC(s); if (r == SNSR_RC_NO_MODEL) fatal(s, r, "set -t task before -f or -s options"); if (r != SNSR_RC_OK) fatalSession(s); } /* Report model license keys. 
 */
static void
reportModelLicense(SnsrSession s, const char *modelfile, int verbose)
{
  const char *msg = NULL;

  if (verbose > 1) {
    snsrGetString(s, SNSR_MODEL_LICENSE_EXPIRES, &msg);
    if (msg) fprintf(stderr, "\"%s\": %s.\n", modelfile, msg);
  }
  msg = NULL;
  snsrGetString(s, SNSR_MODEL_LICENSE_WARNING, &msg);
  if (msg) fprintf(stderr, "WARNING for model \"%s\": %s.\n", modelfile, msg);
}

/* Show CPU required to run the model in real time. This uses timing
 * information gathered while running model inference.
 */
static void
showRealTimeFactor(SnsrSession s, const char *slot)
{
  char path[SNSR_SETTING_SZ], *o;
  double cpuSeconds, rtf, samplesProcessed = 0;
  size_t len;
  int sampleRate;
  SnsrRC r;

  if (!slot) o = path;
  else {
    strcpy(path, slot);
    len = strlen(path);
    if (len && path[len - 1] != '.') strcat(path, ".");
    o = path + strlen(path);
  }
  snsrClearRC(s);
  strcpy(o, SNSR_RES_CPU_SECONDS_USED);
  snsrGetDouble(s, path, &cpuSeconds);
  strcpy(o, SNSR_RES_SAMPLES_PROCESSED);
  snsrGetDouble(s, path, &samplesProcessed);
  r = snsrGetInt(s, SNSR_SAMPLE_RATE, &sampleRate);
  if (r != SNSR_RC_OK || !samplesProcessed) {
    snsrClearRC(s);
    return;
  }
  rtf = cpuSeconds / samplesProcessed * sampleRate;
  if (slot && slot[0]) {
    printf("%5s: %8.0f samples, %6.3f seconds, %5.2f%% rtf\n",
           slot, samplesProcessed, cpuSeconds, rtf * 100);
  } else {
    printf("Total: %8.0f samples, %6.3f seconds, %5.2f%% rtf\n",
           samplesProcessed, cpuSeconds, rtf * 100);
  }
}

/* Reads one \n-terminated line from s into out, which has space for
 * at least outSize characters.
 * Skips empty lines (ones containing just \n) and lines starting with #
 * Strips a single trailing \r from out.
*/ static SnsrRC getNextLine(SnsrStream s, char *out, size_t outSize) { SnsrRC rc; size_t read; snsrRetain(s); for (;;) { read = snsrStreamGetDelim(s, out, outSize, '\n'); rc = snsrStreamRC(s); if (rc != SNSR_RC_OK) break; out[--read] = '\0'; /* strip trailing \n */ if (read > 0 && out[read - 1] == '\r') out[--read] = '\0'; if (read && out[0] != '#') break; } snsrRelease(s); return rc; } static SnsrRC batchEntityList(SnsrSession s, const char *key, void *privateData) { SnsrStream output = (SnsrStream)privateData; const char *name, *value; double score; snsrGetString(s, SNSR_RES_NLU_ENTITY_NAME, &name); snsrGetDouble(s, SNSR_RES_NLU_ENTITY_SCORE, &score); snsrGetString(s, SNSR_RES_NLU_ENTITY_VALUE, &value); snsrStreamPrint(output, "\t%s\t%.4g\t%s", name, score, value); return snsrRC(s); } static SnsrRC batchIntentEvent(SnsrSession s, const char *key, void *privateData) { SnsrStream output = (SnsrStream)privateData; const char *name, *value; double score; snsrGetString(s, SNSR_RES_NLU_INTENT_NAME, &name); snsrGetDouble(s, SNSR_RES_NLU_INTENT_SCORE, &score); snsrGetString(s, SNSR_RES_NLU_INTENT_VALUE, &value); snsrStreamPrint(output, "\t%s\t%4g\t%s", name, score, value); snsrForEach(s, SNSR_NLU_ENTITY_LIST, snsrCallback(batchEntityList, NULL, privateData)); return snsrRC(s); } static SnsrRC batchResultEvent(SnsrSession s, const char *key, void *privateData) { SnsrStream output = (SnsrStream)privateData; double begin, end, score; const char *hyp; snsrGetDouble(s, SNSR_RES_BEGIN_MS, &begin); snsrGetDouble(s, SNSR_RES_END_MS, &end); snsrGetDouble(s, SNSR_RES_SCORE, &score); snsrGetString(s, SNSR_RES_TEXT, &hyp); snsrStreamPrint(output, "\t%.0f\t%.0f\t%.6g\t%s", begin, end, score, hyp); return snsrRC(s); } /* Batch processing mode, one audio filename per line in listFile */ static SnsrRC batch(SnsrSession s, const char *listFile, const char *outFile) { char fileName[MAX_FILENAME_LENGTH]; SnsrStream ls = NULL; SnsrStream out, nlu, result; int fileIndex = 0, fileCount = 0; 
SnsrRC r = SNSR_RC_OK; /* Count the number of lines in listFile */ ls = snsrStreamFromFileName(listFile, "r"); snsrRetain(ls); while ((r = getNextLine(ls, fileName, sizeof(fileName))) == SNSR_RC_OK) fileCount++; if (r != SNSR_RC_EOF) fatal(s, r, "\"%s\": %s", listFile, snsrStreamErrorDetail(ls)); snsrRelease(ls); /* Open output stream */ if (outFile) out = snsrStreamFromFileName(outFile, "w"); else out = snsrStreamFromFILE(stdout, SNSR_ST_MODE_WRITE); /* In-memory result and NLU streams, we'll copy these to the output */ result = snsrStreamFromBuffer(1<<10, 1<<20); nlu = snsrStreamFromBuffer(1<<10, 1<<20); /* Install result handlers for batch mode processing */ snsrSetHandler(s, SNSR_RESULT_EVENT, snsrCallback(batchResultEvent, NULL, result)); snsrSetHandler(s, SNSR_NLU_INTENT_EVENT, snsrCallback(batchIntentEvent, NULL, nlu)); /* Iterate over all audio files in listFile */ ls = snsrStreamFromFileName(listFile, "r"); snsrRetain(ls); while ((r = getNextLine(ls, fileName, sizeof(fileName))) == SNSR_RC_OK) { fileIndex++; if (outFile) { printf("\rProcessing file %i of %i, %.2f%% ", fileIndex, fileCount, (double)fileIndex / fileCount * 100); fflush(stdout); } snsrReset(s); snsrSetStream(s, SNSR_SOURCE_AUDIO_PCM, snsrStreamFromAudioFile(fileName, "r", SNSR_ST_AF_DEFAULT)); r = snsrRun(s); if (r != SNSR_RC_OK && r != SNSR_RC_STREAM_END) { fprintf(stderr, "\nWARNING: %s index %i: %s\n", listFile, fileIndex, snsrErrorDetail(s)); snsrStreamSkip(result, 1, SIZE_MAX); snsrStreamSkip(nlu, 1, SIZE_MAX); continue; } snsrStreamPrint(out, "%i\t%s", fileIndex, fileName); snsrStreamCopy(out, result, SIZE_MAX); snsrStreamCopy(out, nlu, SIZE_MAX); snsrStreamPrint(out, "\n"); } if (r == SNSR_RC_EOF) r = SNSR_RC_OK; snsrRelease(ls); snsrRelease(nlu); snsrRelease(out); snsrRelease(result); if (outFile) printf("\n"); return r; } int main(int argc, char *argv[]) { SnsrRC r; SnsrSession s; SnsrStream tmp, audio = NULL; int i, o, profile = 0; int vadAdded = 0, verbose = 0; const char *dir = 
NULL, *list = NULL, *msg = NULL, *out = NULL, *save = NULL; extern char *optarg; extern int optind; ResultConfig full = {1, 0, 0, 0}, partial = {1, 0, 1, 0}; VadContext vadContext = {NULL, 0, 0}; #ifdef SNSR_USE_SECURITY_CHIP uint32_t *securityChipComms(uint32_t *in); snsrConfig(SNSR_CONFIG_SECURITY_CHIP, securityChipComms); #endif #ifdef _WIN32 SetConsoleOutputCP(CP_UTF8); #endif if (argc == 1) usage(argv[0]); r = snsrNew(&s); if (r != SNSR_RC_OK) fatal(s, r, "%s", s? snsrErrorDetail(s): snsrRCMessage(r)); while ((o = getopt(argc, argv, "ad:f:g:i:lo:pq:s:t:u:v?")) >= 0) { switch (o) { case 'a': { /* Load VAD model compiled into this executable */ extern SnsrCodeModel tpl_vad_lvcsr; const char *taskType = ""; r = snsrGetString(s, SNSR_TASK_TYPE, &taskType); if (r != SNSR_RC_NO_MODEL) fatal(s, r, "set the -a option before -t, " "for example %s -at task", argv[0]); snsrClearRC(s); snsrLoad(s, snsrStreamFromCode(tpl_vad_lvcsr)); quitOnError(s); vadAdded = 1; break; } case 'd': dir = optarg; break; case 'f': if (optind >= argc) usage(argv[0]); snsrSetStream(s, optarg, snsrStreamFromFileName(argv[optind++], "r")); quitOnError(s); reportModelLicense(s, argv[optind - 1], verbose); break; case 'g': if (optind >= argc) usage(argv[0]); snsrSetStream(s, optarg, snsrStreamFromString(argv[optind++])); quitOnError(s); break; case 'i': list = optarg; break; case 'l': verbose--; break; case 'o': out = optarg; break; case 'p': profile++; break; case 'q': { const char *strVal = NULL; if (snsrGetString(s, optarg, &strVal) == SNSR_RC_OK) { const char *q = strVal && strchr(strVal, ' ')? 
"\"": ""; printf("%s = %s%s%s\n", optarg, q, strVal, q); } quitOnError(s); break; } case 's': snsrSet(s, optarg); quitOnError(s); break; case 't': if (vadAdded) { /* The tpl-vad-lvcsr expects an LVCSR or STT model in slot 0 */ snsrSetStream(s, SNSR_SLOT_0, snsrStreamFromFileName(optarg, "r")); vadAdded = 0; } else { snsrLoad(s, snsrStreamFromFileName(optarg, "r")); } quitOnError(s); reportModelLicense(s, optarg, verbose); break; case 'u': snsrSetString(s, SNSR_PRUNE_SETTINGS, "yes"); save = optarg; break; case 'v': verbose++; break; case '?': default: usage(argv[0]); } } r = snsrRequire(s, SNSR_TASK_TYPE_AND_VERSION_LIST, TASKS_SUPPORTED); if (r == SNSR_RC_NO_MODEL) usage(argv[0]); else if (r != SNSR_RC_OK) fatalSession(s); if (list) { if (optind != argc) fatal(s, SNSR_RC_INVALID_ARG, "Use \"-i listfile\" or audio files " "on the command line, but not both."); r = batch(s, list, out); if (r != SNSR_RC_OK && r != SNSR_RC_STREAM_END) fatalSession(s); snsrRelease(s); snsrTearDown(); return 0; } if (out && dir) fatal(s, SNSR_RC_INVALID_ARG, "The -d and -o options are multually exclusive."); /* Report application license status */ if (verbose > 1) { snsrGetString(s, SNSR_LICENSE_EXPIRES, &msg); if (msg) fprintf(stderr, "\"%s\": %s.\n", argv[0], msg); } msg = NULL; snsrGetString(s, SNSR_LICENSE_WARNING, &msg); if (msg) fprintf(stderr, "WARNING for \"%s\": %s.\n", argv[0], msg); r = snsrSetStream(s, SNSR_SOURCE_AUDIO_PCM, NULL); snsrClearRC(s); if (r == SNSR_RC_OK) { /* No audio files provided, use live audio from the * default capture device */ if (optind == argc) { const char *taskType = ""; snsrGetString(s, SNSR_TASK_TYPE, &taskType); if (!strcmp(taskType, SNSR_LVCSR)) fatal(s, SNSR_RC_ERROR, "With live audio LVCSR and STT models require " "a VAD. You can add one with the -a flag."); audio = snsrStreamFromAudioDevice(SNSR_ST_AF_DEFAULT); if (verbose > 0) { printf("Using live audio from default capture device. 
^C to stop.\n"); fflush(stdout); } } else { /* Create stream concatenation of all the audio files */ audio = snsrStreamFromString(""); for (i = optind; i < argc; i++) { if (argv[i][0] == '-' && argv[i][1] == '\0') { tmp = snsrStreamFromFILE(stdin, SNSR_ST_MODE_READ); } else { tmp = snsrStreamFromAudioFile(argv[i], "r", SNSR_ST_AF_DEFAULT); } audio = snsrStreamFromStreams(audio, tmp); } } /* Wire up the audio input stream. */ snsrSetStream(s, SNSR_SOURCE_AUDIO_PCM, audio); } else { /* SNSR_SOURCE_AUDIO_PCM not found, try feature-stream */ r = snsrSetStream(s, SNSR_SOURCE_FEATURE, NULL); snsrClearRC(s); if (r == SNSR_RC_OK) { SnsrStream feature; feature = snsrStreamFromString(""); for (i = optind; i < argc; i++) { SnsrStream tmp = snsrStreamFromFileName(argv[i], "r"); feature = snsrStreamFromStreams(feature, tmp); } r = snsrSetStream(s, SNSR_SOURCE_FEATURE, feature); } else r = SNSR_RC_OK; } /* The SNSR_OPERATING_POINT setting was introduced with 5.0.0-beta.10 */ if (verbose > 1 && snsrRC(s) == SNSR_RC_OK) { int first = 1, point = 0; r = snsrGetInt(s, SNSR_OPERATING_POINT, &point); if (r == SNSR_RC_SETTING_NOT_FOUND || r == SNSR_RC_VALUE_NOT_SET) { snsrClearRC(s); } else { printf("Using operating point %i.\n", point); snsrForEach(s, SNSR_OPERATING_POINT_LIST, snsrCallback(showAvailablePoint, NULL, &first)); printf(".\n"); fflush(stdout); } } /* The SNSR_VOCAB_LIST setting was introduced with 6.7.0. */ if (verbose > 1 && snsrRC(s) == SNSR_RC_OK) { int first = 1; snsrForEach(s, SNSR_VOCAB_LIST, snsrCallback(showVocab, NULL, &first)); snsrClearRC(s); } /* Wire up the optional audio output stream. */ r = snsrSetStream(s, SNSR_SINK_AUDIO_PCM, NULL); if (r == SNSR_RC_DST_CHANNEL_NOT_FOUND) { r = SNSR_RC_OK; snsrClearRC(s); } else if (out) { r = snsrSetStream(s, SNSR_SINK_AUDIO_PCM, snsrStreamFromAudioFile(out, "w", SNSR_ST_AF_DEFAULT)); } else { /* No file specified, turn off VAD audio output. 
*/ r = snsrSetInt(s, SNSR_PASS_THROUGH, 0); } /* Wire up the optional feature output stream. */ if (r == SNSR_RC_OK) { r = snsrSetStream(s, SNSR_SINK_FEATURE, NULL); if (r == SNSR_RC_DST_CHANNEL_NOT_FOUND) { r = SNSR_RC_OK; snsrClearRC(s); } else if (out) { r = snsrSetStream(s, SNSR_SINK_FEATURE, snsrStreamFromFileName(out, "w")); } else { /* No file specified, turn off VAD feature output. */ r = snsrSetInt(s, SNSR_PASS_THROUGH, 0); } } if (r != SNSR_RC_OK) fatalSession(s); /* SNSR_RESULT_MAX introduced in 6.17.0, missing from older models */ r = snsrGetInt(s, SNSR_RESULT_MAX, &full.nBest); if (r != SNSR_RC_OK) snsrClearRC(s); /* Handle recognition results. */ full.verbose = verbose; r = snsrSetHandler(s, SNSR_RESULT_EVENT, snsrCallback(resultEvent, NULL, &full)); /* VAD task types do not include SNSR_RESULT_EVENT support */ if (r == SNSR_RC_SETTING_NOT_FOUND) snsrClearRC(s); else if (r != SNSR_RC_OK) fatalSession(s); /* Partial results might not be available, ignore handler setup errors. */ partial.verbose = verbose; r = snsrSetHandler(s, SNSR_PARTIAL_RESULT_EVENT, snsrCallback(resultEvent, NULL, &partial)); if (r == SNSR_RC_SETTING_NOT_FOUND) snsrClearRC(s); /* VAD callback handlers. These are not supported for all task types. 
*/
  vadContext.verbose = verbose;
  if (dir) {
    size_t dirLen = strlen(dir);
    vadContext.length = dirLen + 32;
    vadContext.filename = malloc(vadContext.length);
    if (!vadContext.filename)
      fatal(s, SNSR_RC_NO_MEMORY, "Could not allocate output filename buffer");
    strcpy(vadContext.filename, dir);
    if (!dirLen) strcat(vadContext.filename, "./");
    else if (dir[dirLen - 1] != '/') strcat(vadContext.filename, "/");
    vadContext.prefix = strlen(vadContext.filename);
  }
  snsrSetHandler(s, SNSR_BEGIN_EVENT,
                 snsrCallback(vadBeginEvent, NULL, &vadContext));
  snsrSetHandler(s, SNSR_SILENCE_EVENT,
                 snsrCallback(vadSilenceEvent, NULL, &vadContext));
  snsrSetHandler(s, SNSR_END_EVENT,
                 snsrCallback(vadEndEvent, NULL, &vadContext));
  snsrSetHandler(s, SNSR_LIMIT_EVENT,
                 snsrCallback(vadEndEvent, NULL, &vadContext));
  /* Ignore not-found errors for VAD handlers */
  if (snsrRC(s) == SNSR_RC_SETTING_NOT_FOUND) snsrClearRC(s);

  /* Prefer NLU intent events added in TrulyNatural 7.1.0 */
  if (verbose > -3) {
    r = snsrSetHandler(s, SNSR_NLU_INTENT_EVENT,
                       snsrCallback(intentEvent, NULL, NULL));
    if (r == SNSR_RC_SETTING_NOT_FOUND || verbose > 1) {
      snsrClearRC(s);
      /* NLU slot events were added in TrulyNatural 6.13.0. */
      r = snsrSetHandler(s, SNSR_NLU_SLOT_EVENT,
                         snsrCallback(nluEvent, NULL, NULL));
      if (r == SNSR_RC_SETTING_NOT_FOUND) snsrClearRC(s);
    }
  }

  /* Events introduced by model version 0.13.0. */
  if (verbose > 1) {
    snsrSetHandler(s, SNSR_LISTEN_BEGIN_EVENT,
                   snsrCallback(showEvent, NULL, NULL));
    snsrSetHandler(s, SNSR_LISTEN_END_EVENT,
                   snsrCallback(showEvent, NULL, NULL));
    /* Treat these as optional, for compatibility with older spotter models. */
    if (snsrRC(s) == SNSR_RC_SETTING_NOT_FOUND) snsrClearRC(s);
  }

  /* Continuous Adaptation spotters provide additional events.
 */
  if (verbose > 0) {
    snsrSetHandler(s, SNSR_ADAPT_STARTED_EVENT,
                   snsrCallback(adaptStartedEvent, NULL, NULL));
    snsrSetHandler(s, SNSR_ADAPTED_EVENT,
                   snsrCallback(showEvent, NULL, NULL));
    snsrSetHandler(s, SNSR_NEW_USER_EVENT,
                   snsrCallback(showEvent, NULL, (void *)1));
    /* Treat these as optional as only CA spotters support them */
    if (snsrRC(s) == SNSR_RC_SETTING_NOT_FOUND) snsrClearRC(s);
  }

  /* SLM events are optional */
  if (verbose > -3) {
    snsrSetHandler(s, SNSR_SLM_START_EVENT,
                   snsrCallback(slmStartEvent, NULL, NULL));
    snsrSetHandler(s, SNSR_SLM_PARTIAL_RESULT_EVENT,
                   snsrCallback(slmPartialResultEvent, NULL, NULL));
    snsrSetHandler(s, SNSR_SLM_RESULT_EVENT,
                   snsrCallback(slmResultEvent, NULL, NULL));
    if (snsrRC(s) == SNSR_RC_SETTING_NOT_FOUND) snsrClearRC(s);
  }

  r = snsrRun(s);
  if (r != SNSR_RC_OK && r != SNSR_RC_STREAM_END) fatalSession(s);
  free(vadContext.filename);

  if (profile == 1) {
    showRealTimeFactor(s, NULL);
    showRealTimeFactor(s, SNSR_SLOT_0);
    showRealTimeFactor(s, SNSR_SLOT_1);
    showRealTimeFactor(s, "0.0.");
    showRealTimeFactor(s, "2.0.");
  } else if (profile > 1) {
    snsrProfile(s, snsrStreamFromFILE(stdout, SNSR_ST_MODE_WRITE));
  }

  if (save) {
    snsrReset(s);
    snsrSave(s, SNSR_FM_CONFIG, snsrStreamFromFileName(save, "w"));
    quitOnError(s);
    if (verbose > 0) printf("Model saved to \"%s\".\n", save);
  }
  snsrRelease(s);
  snsrTearDown();
  if (out && verbose > 0) printf("VAD audio saved to \"%s\".\n", out);
  return 0;
}
```

*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "api/sample/c/spot-convert.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/spot-convert/"
---

# spot-convert.c

This is the source code for the [spot-convert](https://doc.sensory.com/tnl/7.8/tools/spot-convert.md#spot-convert) command-line tool.

## Instructions

See [spot-convert](https://doc.sensory.com/tnl/7.8/tools/spot-convert.md#spot-convert).
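
The usage text embedded in the source below describes the default output filename pattern as `$(prefix)[-][slot$(slotname)-]$(target)-$(version)-op$(operating-point)-{dev,prod}-{net,search}.{bin,c,h}`. As a rough illustration only, that pattern can be sketched with a small helper. `buildOutputName` is a hypothetical name introduced for this example and is not part of the SDK; the sketch also assumes the version string is always present, whereas the tool skips it when none is reported.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not an SDK function): assembles
 *   <prefix><version>-op<NN>-<dev|prod>-<part><ext>
 * following the pattern in the tool's usage text.
 * Assumes prefix already ends in "-" and version is non-NULL.
 * Returns what snprintf() returns: the length the full name needs. */
static int buildOutputName(char *dst, size_t dstSize,
                           const char *prefix,  /* e.g. task and target */
                           const char *version, /* oldest usable DSP library */
                           int point,           /* operating point */
                           int prodReady,       /* non-zero: production model */
                           const char *part,    /* "net" or "search" */
                           const char *ext)     /* ".bin", ".c" or ".h" */
{
  return snprintf(dst, dstSize, "%s%s-op%02i-%s-%s%s",
                  prefix, version, point,
                  prodReady ? "prod" : "dev", part, ext);
}
```

With the made-up values `prefix = "demo-target-"`, `version = "1.0.0"`, `point = 5` and `prodReady = 1`, the helper yields `demo-target-1.0.0-op05-prod-net.bin` for the acoustic model. Comparing the return value against the buffer size is one way to detect truncation.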
## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/spot-convert.c_ **spot-convert.c:** ```c /* Sensory Confidential * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/ * * TrulyHandsfree SDK model conversion command-line utility. *------------------------------------------------------------------------------ */ #include #include #include #include #include #define EMBED_TASK_VERSION "~0.6.0 || 1.0.0" #define HEADER_NAME "-search.h" #define SEARCH_NAME "-search.bin" #define ACMODEL_NAME "-net.bin" #define FILENAME_SIZE 1023 #define KEY_SIZE 64 #define TARGET_SIZE 16 #if defined(_MSC_VER) && (_MSC_VER < 1900) # define snprintf _snprintf #endif typedef struct { const char *basename; /* output file prefix */ const char *slot; /* slot prefix */ const char *target; /* embedded target descriptor */ int fileNameInfo; /* append version and operating point to filename */ int outputC; /* true to generate C output files */ int point; /* operating point to convert */ int verbose; /* feedback verbosity */ } ConvertContext; static void fatal(int rc, const char *format, ...) { va_list a; fprintf(stderr, "ERROR: "); va_start(a, format); vfprintf(stderr, format, a); va_end(a); fprintf(stderr, "\n"); exit(rc); } /* Concatenates head, ".", and tail into key, and returns * a pointer to key. */ const char * slotKey(const char *head, const char *tail, char key[KEY_SIZE + 1]) { if (strlen(head) + strlen(tail) + 2 > KEY_SIZE) fatal(SNSR_RC_INVALID_ARG, "-q slotname prefix is too long."); strncpy(key, head, KEY_SIZE); strncat(key, ".", KEY_SIZE); strncat(key, tail, KEY_SIZE); key[KEY_SIZE - 1] = '\0'; return key + (*head == '\0'); /* skip leading . 
if head is empty */ } static void writeFile(SnsrStream model, ConvertContext *ctx, const char *tag, const char *pre, const char *ver, const char *prod, const char *mid, const char *post, const char *ext, const char *mode) { SnsrRC r; SnsrStream output; size_t written; char filename[FILENAME_SIZE + 1]; filename[FILENAME_SIZE] = '\0'; strncpy(filename, pre, FILENAME_SIZE); if (ctx->fileNameInfo) { if (ver) { strncat(filename, ver, FILENAME_SIZE); strncat(filename, "-", FILENAME_SIZE); } strncat(filename, mid, FILENAME_SIZE); strncat(filename, prod, FILENAME_SIZE); } strncat(filename, post, FILENAME_SIZE); strncat(filename, ext, FILENAME_SIZE); written = snsrStreamGetMeta(model, SNSR_ST_META_BYTES_WRITTEN); output = snsrStreamFromFileName(filename, mode); snsrRetain(output); snsrStreamCopy(output, model, written); r = snsrStreamRC(output); if (r != SNSR_RC_OK) fatal(r, snsrStreamErrorDetail(output)); snsrRelease(output); if (ctx->verbose > 0) { printf("wrote %s to \"%s\"\n", tag, filename); fflush(stdout); } } static SnsrRC writeEmbeddedFiles(SnsrSession s, ConvertContext *ctx, const char *slot) { SnsrRC r; SnsrStream net = NULL, sch = NULL, hdr = NULL; #define OP_SIZE 6 char op[OP_SIZE]; char kb[KEY_SIZE + 1]; char srcTarget[TARGET_SIZE + 1]; const char *tSliceVersion = NULL; int prodReady = 0; const char *key, *prod = "prod-"; r = snsrSetString(s, slotKey(slot, SNSR_EMBEDDED_TARGET, kb), ctx->target); if (r == SNSR_RC_SETTING_NOT_FOUND) fatal(snsrRC(s), "This model cannot be converted to DSP format." 
" (%s)", snsrErrorDetail(s)); snsrGetStream(s, slotKey(slot, SNSR_EMBEDDED_ACMODEL_STREAM, kb), &net); if (net) snsrRetain(net); snsrGetStream(s, slotKey(slot, SNSR_EMBEDDED_SEARCH_STREAM, kb), &sch); if (sch) snsrRetain(sch); snsrGetStream(s, slotKey(slot, SNSR_EMBEDDED_HEADER_STREAM, kb), &hdr); if (hdr) snsrRetain(hdr); key = slotKey(slot, SNSR_RES_MIN_EMBEDDED_VERSION, kb); snsrGetString(s, key, &tSliceVersion); if (tSliceVersion) snsrRetain(tSliceVersion); key = slotKey(slot, SNSR_RES_EMBEDDED_MODEL_PRODUCTION_READY, kb); r = snsrGetInt(s, key, &prodReady); if (r != SNSR_RC_OK) fatal(snsrRC(s), "%s", snsrErrorDetail(s)); snprintf(op, OP_SIZE, "op%02i-", ctx->point); if (ctx->verbose > 0) { printf("operating-point: %i\n", ctx->point); fflush(stdout); } if (!prodReady) prod = "dev-"; if (ctx->verbose > 0) { printf("production-ready: %s\n", prodReady? "yes": "no"); fflush(stdout); } writeFile(net, ctx, "acoustic model (bin)", ctx->basename, tSliceVersion, prod, op, "net", ".bin", "w"); snsrRelease(net); writeFile(sch, ctx, "search model (bin)", ctx->basename, tSliceVersion, prod, op, "search", ".bin", "w"); snsrRelease(sch); writeFile(hdr, ctx, "search header", ctx->basename, tSliceVersion, prod, op, "search", ".h", "wt"); snsrRelease(hdr); if (ctx->outputC) { memset(srcTarget, 0, TARGET_SIZE + 1); strncpy(srcTarget, "src:", TARGET_SIZE); strncat(srcTarget, ctx->target, TARGET_SIZE); snsrSetString(s, slotKey(slot, SNSR_EMBEDDED_TARGET, kb), srcTarget); snsrGetStream(s, slotKey(slot, SNSR_EMBEDDED_ACMODEL_STREAM, kb), &net); writeFile(net, ctx, "acoustic model (C)", ctx->basename, tSliceVersion, prod, op, "net", ".c", "wt"); snsrGetStream(s, slotKey(slot, SNSR_EMBEDDED_SEARCH_STREAM, kb), &sch); writeFile(sch, ctx, "search model (C)", ctx->basename, tSliceVersion, prod, op, "search", ".c", "wt"); } snsrRelease(tSliceVersion); return snsrRC(s); } static SnsrRC convertAllPoints(SnsrSession s, const char *key, void *data) { ConvertContext *c = (ConvertContext *)data; 
const char *slot = c->slot; char keyBuff[KEY_SIZE + 1]; SnsrRC r; snsrGetInt(s, slotKey(slot, SNSR_RES_AVAILABLE_POINT, keyBuff), &c->point); r = snsrSetInt(s, slotKey(slot, SNSR_OPERATING_POINT, keyBuff), c->point); if (r != SNSR_RC_OK) return r; return writeEmbeddedFiles(s, c, slot); } static const char *usageDetail = "Output filenames are determined by the model parameters:\n" " $(prefix) [-] [slot$(slotname)-] $(target)- $(version)-\n" " op$(operating-point)- {dev,prod}- {net,search}.{bin,c,h}\n" "where:\n" " prefix specified by the -p option, or taken from the filename\n" " of the task if -p isn't used.\n" " version is the oldest DSP library that can run this model.\n" " -dev- models are limited in runtime or number of recognition\n" " events and should not be used in products.\n" " -prod- models are not limited and ready for production use.\n" "\n" "Use the -o option to override this filename pattern to:\n" " $(prefix)[-]{net,search}.{bin,c,h}\n" "\n" "The -o and -a options are mutually exclusive.\n" "\n" "Output filenames are constrained to never start with \"-\"\n" "\n" "Settings are strings used as keys to query or change task behavior.\n" "Most frequently used for wake words and command sets is operating-point.\n" "Refer to the " SNSR_NAME " SDK documentation for a complete list and\n" "descriptions of all supported settings.\n"; static void usage(const char *name) { SnsrSession s; const char *libInfo; fprintf(stderr, "Converts " SNSR_NAME " SDK wake word models to " "THF Micro format.\n\n"); fprintf(stderr, "usage: %s -t task [options] target\n" " options:\n" " -a : convert all operating-points\n" " -c : create .c output (in addition to .bin)\n" " -o output : full prefix for output filenames\n" " -p output-prefix : prefix for output filenames " "(default: task-target-)\n" " -q slotname : model slot prefix\n" " -s setting=value : override a task setting\n" " -t task : set a task filename (required)\n" " -v [-v [-v]] : increase verbosity\n", name); 
fprintf(stderr, "\n%s", usageDetail); snsrNew(&s); snsrGetString(s, SNSR_LIBRARY_INFO, &libInfo); fprintf(stderr, "\n%s\n", libInfo); snsrRelease(s); exit(199); } /* Report model license keys that have imminent expiration dates. */ static void reportExpiringModelLicense(SnsrSession s, const char *modelfile) { const char *expWarning = NULL; snsrGetString(s, SNSR_MODEL_LICENSE_WARNING, &expWarning); if (expWarning) fprintf(stderr, "WARNING for model \"%s\": %s.\n", modelfile, expWarning); } int main(int argc, char *argv[]) { SnsrRC r; SnsrSession s; char basename[FILENAME_SIZE + 1]; char keyBuff[KEY_SIZE + 1]; int allPoints = 0, o; const char *prefix = NULL, *task = NULL; ConvertContext ctx; extern char *optarg; extern int optind; #ifdef SNSR_USE_SECURITY_CHIP uint32_t *securityChipComms(uint32_t *in); snsrConfig(SNSR_CONFIG_SECURITY_CHIP, securityChipComms); #endif if (argc == 1) usage(argv[0]); ctx.basename = basename; ctx.slot = ""; ctx.target = NULL; ctx.outputC = 0; ctx.point = 0; ctx.fileNameInfo = 1; ctx.verbose = 0; r = snsrNew(&s); if (r != SNSR_RC_OK) fatal(r, s? 
snsrErrorDetail(s): snsrRCMessage(r)); while ((o = getopt(argc, argv, "aco:p:q:s:t:v?")) >= 0) { switch (o) { case 'a': allPoints = 1; break; case 'c': ctx.outputC = 1; break; case 'o': prefix = optarg; ctx.fileNameInfo = 0; break; case 'p': prefix = optarg; break; case 'q': ctx.slot = optarg; break; case 's': r = snsrSet(s, optarg); if (r == SNSR_RC_NO_MODEL) fatal(r, "Set -t task before -s setting=value"); else if (r != SNSR_RC_OK) fatal(r, snsrErrorDetail(s)); break; case 't': snsrLoad(s, snsrStreamFromFileName(optarg, "r")); snsrRequire(s, SNSR_TASK_TYPE, SNSR_PHRASESPOT); r = snsrRequire(s, SNSR_TASK_VERSION, EMBED_TASK_VERSION); if (r != SNSR_RC_OK) fatal(r, snsrErrorDetail(s)); task = optarg; reportExpiringModelLicense(s, optarg); break; case 'v': ctx.verbose++; break; case '?': default: usage(argv[0]); } } if (optind != argc - 1) usage(argv[0]); ctx.target = argv[optind]; if (allPoints && !ctx.fileNameInfo) fatal(SNSR_RC_INVALID_ARG, "The -a and -o options are mutually exclusive."); r = snsrRequire(s, SNSR_TASK_TYPE, SNSR_PHRASESPOT); if (r == SNSR_RC_NO_MODEL) usage(argv[0]); /* We'll include the source filename in the header output */ r = snsrSetString(s, SNSR_MODEL_NAME, task); if (r != SNSR_RC_OK) fatal(r, snsrErrorDetail(s)); /* Output filename prefix buffer */ basename[FILENAME_SIZE] = '\0'; if (prefix) strncpy(basename, prefix, FILENAME_SIZE); else { char *e; assert(task); if ((e = (char *)strrchr(task, '/'))) strncpy(basename, e + 1, FILENAME_SIZE); else strncpy(basename, task, FILENAME_SIZE); if ((e = strrchr(basename, '.'))) *e = '\0'; } if (*basename) strncat(basename, "-", FILENAME_SIZE); if (ctx.fileNameInfo) { char *e; size_t len; if (*ctx.slot) { len = strlen(basename); snprintf(basename + len, FILENAME_SIZE - len, "slot%s-", ctx.slot); } len = strlen(basename); snprintf(basename + len, FILENAME_SIZE - len, "%s-", ctx.target); if ((e = strchr(basename + len, ':'))) *e = '_'; } if (ctx.verbose > 1) { printf("target: %s\n", ctx.target); 
    printf("basename: %s\n", ctx.basename);
    fflush(stdout);
  }

  if (allPoints) {
    snsrForEach(s, slotKey(ctx.slot, SNSR_OPERATING_POINT_LIST, keyBuff),
                snsrCallback(convertAllPoints, NULL, &ctx));
  } else {
    /* Very old models do not have support for operating points */
    if (snsrRC(s) != SNSR_RC_OK) fatal(snsrRC(s), snsrErrorDetail(s));
    snsrGetInt(s, slotKey(ctx.slot, SNSR_OPERATING_POINT, keyBuff),
               &ctx.point);
    snsrClearRC(s);
    writeEmbeddedFiles(s, &ctx, ctx.slot);
  }
  if (snsrRC(s) != SNSR_RC_OK) fatal(snsrRC(s), snsrErrorDetail(s));
  snsrRelease(s);
  snsrTearDown();
  return 0;
}
```

*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "api/sample/c/spot-data-stream.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/spot-data-stream/"
---

# spot-data-stream.c

This example runs a wake word from code space with a [custom audio stream](https://doc.sensory.com/tnl/7.8/api/sample/c/data-stream.md#data-streamc), using pull mode processing with [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run). It is a reasonable starting point for running on a small device with an RTOS.

**Also see these related items:** [fromProvider](https://doc.sensory.com/tnl/7.8/api/io.md#fromprovider), [data-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/data-stream.md#data-streamc)

For the Python custom-stream shape, see [custom_stream.py](https://doc.sensory.com/tnl/7.8/api/sample/python/custom_stream.md#custom_streampy).

## Instructions

[Build](https://doc.sensory.com/tnl/7.8/api/sample/c/index.md#examples-cmake) the sample code. In the same terminal window type the command after the `%`:

```console
% ./bin/spot-data-stream
Spotted "hello blue genie" from sample 6720 to sample 18000
Done, found phrase.
``` ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/spot-data-stream.c_ **spot-data-stream.c:** ```c /* Sensory Confidential * Copyright (C)2018-2026 Sensory, Inc. https://sensory.com/ * * TrulyHandsfree SDK example illustrating the use of the * sample custom data-stream (see data-stream.c) This should * be easily adaptable to a custom live-audio stream * (in case of a custom audio driver for RTOS for example.) * * The spotter model is loaded from code space. On platforms where code is * read directly from ROM, this will reduce heap requirements. *----------------------------------------------------------------------------- */ #include #include #include #include "data-stream.h" /* See spot-hbg-enUS-1.4.0-m.c */ extern SnsrCodeModel spot_hbg_enUS; /* NOTE: extern char * foo is NOT always the same as extern char foo[] */ extern unsigned char audioData[]; extern unsigned int audioDataLen; /* Result callback function, see snsrSetHandler() below. * Print the result text and the start and end sample indices of * the first spotted phrase. */ static SnsrRC resultEvent(SnsrSession s, const char *key, void *privateData) { SnsrRC rc; const char *phrase; double begin, end; /* Retrieve the phrase text and alignments from the session handle */ snsrGetDouble(s, SNSR_RES_BEGIN_SAMPLE, &begin); snsrGetDouble(s, SNSR_RES_END_SAMPLE, &end); rc = snsrGetString(s, SNSR_RES_TEXT, &phrase); /* Quit early if an error occurred. */ if (rc != SNSR_RC_OK) return rc; printf("\nSpotted \"%s\" from sample %d to sample %d\n", phrase, (int)begin, (int)end); /* This return code from the event handler sets the * return code in the SnsrSession and causes the session * to stop */ return SNSR_RC_STOP; } int main(int argc, char **argv) { SnsrRC rc; SnsrSession s = NULL; SnsrStream audioStream = NULL; rc = snsrNew(&s); if (rc != SNSR_RC_OK) { const char *err = s ? 
      snsrErrorDetail(s) : snsrRCMessage(rc);
    fprintf(stderr, "Error on init: %d - %s\n", rc, err);
    return rc;
  }

  /* Load and validate the spotter model task from code space */
  snsrLoad(s, snsrStreamFromCode(spot_hbg_enUS));
  if (snsrRequire(s, SNSR_TASK_TYPE, SNSR_PHRASESPOT) != SNSR_RC_OK) {
    fprintf(stderr, "Error loading spotter: %s\n", snsrErrorDetail(s));
    /* rc still holds the snsrNew() result; report the session code instead */
    return snsrRC(s);
  }

  /* Register a result callback. Private data handle is not used. */
  snsrSetHandler(s, SNSR_RESULT_EVENT,
                 snsrCallback(resultEvent, NULL, NULL));

  /* NOTE: Audio stream should be 16 kHz, 16 bits/sample, mono */
  /* NOTE: Directly casting char to short works on little-endian only */
  /* ARM and x86 are little-endian, MIPS may not be */
  audioStream = streamFromData(audioData, audioDataLen, SNSR_ST_MODE_READ);
  snsrSetStream(s, SNSR_SOURCE_AUDIO_PCM, audioStream);

  /* snsrRun won't return until stopped or interrupted or end of data */
  rc = snsrRun(s);
  switch (rc) {
    case SNSR_RC_OK:
      printf("Done, no error but no phrase.\n");
      break;
    case SNSR_RC_STOP:
      printf("Done, found phrase.\n");
      break;
    case SNSR_RC_STREAM_END:
      printf("Reached end of stream.\n");
      break;
    default:
      printf("Unexpected return: %d\n", rc);
      return rc;
  }
  printf("\n");

  if (s) snsrRelease(s);
  /* audioStream has already been released because session had
   * the only reference to it - so don't snsrRelease it again. */
  return 0;
}
```

*[API]: Application Programming Interface
*[RTOS]: Real-Time Operating System
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "api/sample/c/spot-data.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/spot-data/"
---

# spot-data.c

This example runs a small keyword spotter from code space. It uses a [custom memory allocator](https://doc.sensory.com/tnl/7.8/api/library-config.md#alloctlsf) to avoid calls to the system heap allocator, and reads audio data from code space to avoid file system use.
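
Reading audio from code space amounts to walking a byte array in fixed-size blocks instead of reading a file. The following is a minimal sketch of that walk, assuming 16 kHz, 16-bit mono audio and the 15 ms block size used in the source listing further down. `BlockConsumer`, `forEachBlock` and `countBytes` are hypothetical names for this illustration; the consumer stands in for whatever call hands audio to the recognizer and is not an SDK function.

```c
#include <stddef.h>

#define BLOCK_MS    15     /* block duration in milliseconds */
#define SAMPLE_RATE 16000  /* samples per second */
#define BLOCK_BYTES (BLOCK_MS * SAMPLE_RATE / 1000 * sizeof(short))

/* Hypothetical consumer type: receives one block, returns non-zero to stop. */
typedef int (*BlockConsumer)(const unsigned char *data, size_t bytes,
                             void *ctx);

/* Walks len bytes of in-memory audio in BLOCK_BYTES chunks. The final
 * chunk may be shorter. Stops early if the consumer returns non-zero.
 * Returns the number of blocks handed to the consumer. */
static size_t forEachBlock(const unsigned char *audio, size_t len,
                           BlockConsumer consume, void *ctx)
{
  size_t i, blocks = 0;
  for (i = 0; i < len; i += BLOCK_BYTES) {
    size_t n = len - i < BLOCK_BYTES ? len - i : BLOCK_BYTES;
    blocks++;
    if (consume(audio + i, n, ctx) != 0) break;
  }
  return blocks;
}

/* Example consumer: adds each block's size to a size_t total. */
static int countBytes(const unsigned char *data, size_t bytes, void *ctx)
{
  (void)data;
  *(size_t *)ctx += bytes;
  return 0;
}
```

Where `short` is two bytes, `BLOCK_BYTES` evaluates to 480, so a 1000-byte buffer is delivered as two full blocks and one 40-byte remainder.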
Uses [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) for audio processing. This is a good starting point for applications running on small embedded devices without full operating system support. ## Instructions [Build](https://doc.sensory.com/tnl/7.8/api/sample/c/index.md#examples-cmake) the sample code. In the same terminal window type the command after the `%`: ```console % ./bin/spot-data Spotted "hello blue genie" from sample 6720 to sample 18000 Phrase spotted. ``` ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/spot-data.c_ **spot-data.c:** ```c /* Sensory Confidential * Copyright (C)2018-2026 Sensory, Inc. https://sensory.com/ * * TrulyHandsfree SDK example illustrating the simplest way * to get data from whatever source (custom RTOS audio driver perhaps) * and push the data into the input stream and get results from the session. * * Illustrates the use of a custom memory allocator to avoid runtime * allocations from the system heap, and use of a panic function to * recover from otherwise fatal memory errors. * * Similar to sample push-audio.c but even simpler and does not use * a filesystem. * * The spotter model is loaded from code space. On platforms where code is * read directly from ROM, this will reduce heap requirements. *----------------------------------------------------------------------------- */ #include #include #include #include #include /* See spot-hbg-enUS-1.4.0-m.c */ extern SnsrCodeModel spot_hbg_enUS; /* See data.c */ extern unsigned char audioData[]; extern unsigned int audioDataLen; /* Process one 15 ms audio block at a time. * Most spotters run at a 15 ms frame rate. Matching the processing block * size to the frame rate reduces live audio processing overhead to a minimum. * Larger blocks will result in fewer calls to snsrPush(), but add latency. * Smaller blocks are accumulated in snsrPush() until at least a frame's worth * is available. 
 */
#define BLOCK_MS 15
#define SAMPLE_RATE 16000
#define BLOCK_BYTES (BLOCK_MS * SAMPLE_RATE / 1000 * sizeof(short))

/* Heap backing store, see snsrConfig(SNSR_CONFIG_ALLOC, ...) call in main.
 *
 * Set HEAP_SIZE to 100000 to trigger an out-of-memory panic and
 * subsequent recovery.
 */
#define HEAP_SIZE 200000
static size_t HeapPool[HEAP_SIZE / sizeof(size_t)];

/* Utility, returns the lesser of a and b */
#define MIN(a, b) ((a) < (b)? (a): (b))

/* Result callback function, see snsrSetHandler() below.
 * Print the result text and the start and end sample indices of
 * the first spotted phrase.
 */
static SnsrRC
resultEvent(SnsrSession s, const char *key, void *privateData)
{
  SnsrRC rc;
  const char *phrase;
  double begin, end;

  /* Retrieve the phrase text and alignments from the session handle */
  snsrGetDouble(s, SNSR_RES_BEGIN_SAMPLE, &begin);
  snsrGetDouble(s, SNSR_RES_END_SAMPLE, &end);
  rc = snsrGetString(s, SNSR_RES_TEXT, &phrase);
  /* Quit early if an error occurred. */
  if (rc != SNSR_RC_OK) return rc;
  printf("\nSpotted \"%s\" from sample %d to sample %d\n",
         phrase, (int)begin, (int)end);
  /* This return code from the event handler sets the
   * return code in the SnsrSession. */
  return SNSR_RC_STOP;
}

/* Saved calling environment used by panicFunc() below */
static jmp_buf PanicJmp;

/* This function handles unrecoverable errors. Here it logs a message
 * to the console, then does a longjmp to the very start of the application
 * where a recovery attempt is made.
 */
static void
panicFunc(const char *format, va_list a)
{
  fprintf(stderr, "\nPANIC: ");
  vfprintf(stderr, format, a);
  fprintf(stderr, "\n\n");
  longjmp(PanicJmp, SNSR_RC_NO_MEMORY);
}

int
main(int argc, char **argv)
{
  SnsrRC rc;
  SnsrSession s = NULL;
  int jmp;
  unsigned i;

  /* Register a custom panic handler */
  snsrConfig(SNSR_CONFIG_PANIC_FUNC, panicFunc);
  if ((jmp = setjmp(PanicJmp))) {
    /* Out-of-memory error occurred.
       Abandon heap, re-initialize */
    snsrTearDown();
    fprintf(stderr, "Restarting application with default allocator.\n");
  } else {
    /* Use a custom allocator to avoid calls to malloc(), et al */
    rc = snsrConfig(SNSR_CONFIG_ALLOC,
                    snsrAllocTLSF(HeapPool, sizeof(HeapPool)));
    if (rc != SNSR_RC_OK) {
      fprintf(stderr, "Custom allocation failure: %s\n", snsrRCMessage(rc));
      return rc;
    }
  }

  rc = snsrNew(&s);
  if (rc != SNSR_RC_OK) {
    fprintf(stderr, "ERROR: %s\n",
            s? snsrErrorDetail(s) : snsrRCMessage(rc));
    return rc;
  }

  /* Load and validate the spotter model task from code space */
  snsrLoad(s, snsrStreamFromCode(spot_hbg_enUS));
  if (snsrRequire(s, SNSR_TASK_TYPE, SNSR_PHRASESPOT) != SNSR_RC_OK) {
    fprintf(stderr, "ERROR: %s\n", snsrErrorDetail(s));
    /* rc still holds the snsrNew() result; report the session code instead */
    return snsrRC(s);
  }

  /* Register a result callback. Private data handle is not used. */
  snsrSetHandler(s, SNSR_RESULT_EVENT,
                 snsrCallback(resultEvent, NULL, NULL));

  /* Main loop. Push all audio data through the recognizer pipeline. */
  for (i = 0; i < audioDataLen; i += BLOCK_BYTES) {
    rc = snsrPush(s, SNSR_SOURCE_AUDIO_PCM, audioData + i,
                  MIN(BLOCK_BYTES, audioDataLen - i));
    if (rc == SNSR_RC_STOP) {
      printf("Phrase spotted.\n");
      fflush(stdout);
      snsrClearRC(s);
    } else if (rc != SNSR_RC_OK) {
      fprintf(stderr, "ERROR: %s\n", snsrErrorDetail(s));
      return rc;
    }
  }

  /* Flush any remaining internally-buffered audio */
  rc = snsrStop(s);
  snsrRelease(s);
  snsrTearDown();
  return rc;
}
```

*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "api/sample/c/spot-enroll.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/spot-enroll/"
---

# spot-enroll.c

This is the source code for the [spot-enroll](https://doc.sensory.com/tnl/7.8/tools/spot-enroll.md#spot-enroll) command-line tool.

## Instructions

See [spot-enroll](https://doc.sensory.com/tnl/7.8/tools/spot-enroll.md#spot-enroll).
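
The usage text in the source below gives the enrollment arguments the shape `[+user1 file1 [-c] file2 ...] [+user2 ...]`: a `+user` token opens a group and the filenames that follow belong to that user. The sketch below shows one way to walk such a list. It is not the tool's actual parser; `EnrollPair` and `forEachEnrollment` are hypothetical names, and the `-c` marker is simply skipped here rather than interpreted.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: receives each (user, file) pair in order. */
typedef void (*EnrollPair)(const char *user, const char *file, void *ctx);

/* Walks arguments shaped like "+user1 file1 file2 +user2 file3".
 * A "+name" token selects the current user; other tokens are files.
 * The "-c" marker is ignored in this sketch.
 * Returns the number of files seen, or -1 if a file precedes any user. */
static int forEachEnrollment(int argc, const char *const *argv,
                             EnrollPair visit, void *ctx)
{
  const char *user = NULL;
  int i, files = 0;
  for (i = 0; i < argc; i++) {
    if (argv[i][0] == '+') {
      user = argv[i] + 1;       /* drop the leading '+' */
      continue;
    }
    if (strcmp(argv[i], "-c") == 0) continue;
    if (!user) return -1;       /* file with no owning user */
    if (visit) visit(user, argv[i], ctx);
    files++;
  }
  return files;
}
```

Passing `NULL` for `visit` only counts the files, which is enough to validate the argument shape before any audio is opened.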
## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/spot-enroll.c_ **spot-enroll.c:** ```c /* Sensory Confidential * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/ * * TrulyHandsfree SDK keyword spotting command-line enrollment utility. *------------------------------------------------------------------------------ */ #include #include #include #include #define DEFAULT_OUT "enrolled-sv.snsr" #define ENROLL_TASK_VERSION "~0.10.0 || 1.0.0" typedef struct { const char *enrollfile; /* current enrollment filename */ const char **filename; /* enrollment filenames, for error messages */ const char *enrolled; /* optional enrollment context file name */ const char *adapted; /* optional adapted context file name */ const char *model; /* enrolled phrase spotter output file name */ size_t fileCount; /* number of allocated filenames */ int failed; /* number of failed enrollments */ int verbosity; /* incremented by the -v flag */ } EnrollContext; static SnsrRC doneEvent(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; SnsrRC r; SnsrStream model = NULL, out; size_t written; r = snsrGetStream(s, SNSR_MODEL_STREAM, &model); if (r != SNSR_RC_OK) return r; written = snsrStreamGetMeta(model, SNSR_ST_META_BYTES_WRITTEN); out = snsrStreamFromFileName(e->model, "w"); snsrStreamCopy(out, model, written); r = snsrStreamRC(out); if (r != SNSR_RC_OK) snsrDescribeError(s, "%s", snsrStreamErrorDetail(out)); snsrRelease(out); if (r == SNSR_RC_OK && e->verbosity >= 1) { printf("Enrolled model saved to \"%s\"\n", e->model); fflush(stdout); } return r; } static SnsrRC adaptedEvent(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; SnsrRC r; r = snsrSave(s, SNSR_FM_RUNTIME, snsrStreamFromFileName(e->adapted, "w")); if (r == SNSR_RC_OK && e->verbosity >= 1) { printf("Adapted enrollment context saved to \"%s\"\n", e->adapted); 
fflush(stdout); } return r; } static SnsrRC enrolledEvent(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; SnsrRC r; r = snsrSave(s, SNSR_FM_RUNTIME, snsrStreamFromFileName(e->enrolled, "w")); if (r == SNSR_RC_OK && e->verbosity >= 1) { printf("Enrollment context saved to \"%s\"\n", e->enrolled); fflush(stdout); } return r; } static SnsrRC printReason(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; const char *guidance, *reason; int pass = 0; double value = 0.0, threshold = 0.0; snsrGetInt(s, SNSR_RES_REASON_PASS, &pass); if (pass) return snsrRC(s); snsrGetString(s, SNSR_RES_REASON, &reason); snsrGetString(s, SNSR_RES_GUIDANCE, &guidance); snsrGetDouble(s, SNSR_RES_REASON_VALUE, &value); snsrGetDouble(s, SNSR_RES_REASON_THRESHOLD, &threshold); if (snsrRC(s) == SNSR_RC_OK) { fprintf(stderr, " reason: %s", reason); if (e->verbosity >= 2) fprintf(stderr, " (%.2f, threshold is %.2f)", value, threshold); fprintf(stderr, "\n fix: %s\n", guidance); fflush(stdout); } return snsrRC(s); } static SnsrRC failEvent(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; SnsrRC r; int id; r = snsrGetInt(s, SNSR_RES_ENROLLMENT_ID, &id); if (r != SNSR_RC_OK) return r; fprintf(stderr, "Enrollment from file \"%s\" failed:\n", (size_t)id < e->fileCount? 
e->filename[id]: e->enrollfile); printReason(s, key, privateData); if (e->verbosity >= 3) { fprintf(stderr, "\nAll failed checks:\n"); snsrForEach(s, SNSR_REASON_LIST, snsrCallback(printReason, NULL, e)); } fflush(stdout); e->failed++; return SNSR_RC_OK; } static SnsrRC passEvent(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; SnsrRC r; int id; r = snsrGetInt(s, SNSR_RES_ENROLLMENT_ID, &id); if (r != SNSR_RC_OK) return r; if ((size_t)id >= e->fileCount) { e->fileCount++; e->filename = (const char **)realloc((char **)e->filename, sizeof(*e->filename) * e->fileCount); } e->filename[id] = e->enrollfile; return SNSR_RC_OK; } static SnsrRC progEvent(SnsrSession s, const char *key, void *privateData) { SnsrRC r = SNSR_RC_OK; EnrollContext *e = (EnrollContext *)privateData; if (e->verbosity >= 1) { double progress; r = snsrGetDouble(s, SNSR_RES_PERCENT_DONE, &progress); if (r == SNSR_RC_OK) { printf("\rAdapting: %3.0f%% complete.", progress); if (progress >= 100) printf("\n"); fflush(stdout); } } return r; } static SnsrRC userIterator(SnsrSession s, const char *key, void *privateData) { EnrollContext *e = (EnrollContext *)privateData; SnsrRC r; int count, recommended; const char *user; snsrGetString(s, SNSR_USER, &user); snsrGetInt(s, SNSR_ENROLLMENT_TARGET, &recommended); r = snsrGetInt(s, SNSR_RES_ENROLLMENT_COUNT, &count); if (r == SNSR_RC_OK) { if (e->verbosity >= 2) printf("%16s: %u enrollment%s.\n", user, count, count == 1? "": "s"); if (count != recommended) fprintf(stderr, "WARNING: \"%s\" has %i enrollment%s, task recommends " "%i for optimal performance.\n", user, count, count == 1? "": "s", recommended); fflush(stdout); } return r; } static void fatal(int rc, const char *format, ...) 
{ va_list a; fprintf(stderr, "ERROR: "); va_start(a, format); vfprintf(stderr, format, a); va_end(a); fprintf(stderr, "\n"); exit(rc); } static const char *usageDetail = "Settings are strings used as keys to query or change task behavior.\n" "Most frequently used for enrollment is accuracy, which takes a value\n" "between 0 and 1.\n" "Refer to the " SNSR_NAME " SDK documentation for a complete list and\n" "descriptions of all supported settings.\n"; static void usage(const char *name) { SnsrSession s; const char *libInfo; fprintf(stderr, "Enrolls " SNSR_NAME " SDK wake words on audio files.\n\n"); fprintf(stderr, "usage: %s -t task [options] " "[+user1 file1 [-c] file2 ...] [+user2 ...]\n" " options:\n" " -a adaptedfile : adapted enrollment context output filename\n" " -c file : recording contains trailing context\n" " -e enrolledfile : enrollment context output filename\n" " -o out : enrolled model output filename (default: " DEFAULT_OUT ")\n" " -s setting=value : override a task setting\n" " -t task : specify task filename (required)\n" " -v [-v [-v]] : increase verbosity\n", name); fprintf(stderr, "\n%s", usageDetail); snsrNew(&s); snsrGetString(s, SNSR_LIBRARY_INFO, &libInfo); fprintf(stderr, "\n%s\n", libInfo); snsrRelease(s); exit(199); } /* Report model license keys. 
*/ static void reportModelLicense(SnsrSession s, const char *modelfile, int verbose) { const char *msg = NULL; if (verbose > 1) { snsrGetString(s, SNSR_MODEL_LICENSE_EXPIRES, &msg); if (msg) fprintf(stderr, "\"%s\": %s.\n", modelfile, msg); } msg = NULL; snsrGetString(s, SNSR_MODEL_LICENSE_WARNING, &msg); if (msg) fprintf(stderr, "WARNING for model \"%s\": %s.\n", modelfile, msg); } /* List enrollment phrases and IDs, where available */ static SnsrRC showVocab(SnsrSession s, const char *key, void *privateData) { SnsrRC r; const char *text = NULL; int id = -1, *first = (int *)privateData; snsrGetInt(s, SNSR_RES_ID, &id); r = snsrGetString(s, SNSR_RES_TEXT, &text); if (r != SNSR_RC_OK) return r; if (*first) printf("Available vocabulary:\n"); printf(" %2i: \"%s\"\n", id, text); *first = 0; return r; } int main(int argc, char *argv[]) { EnrollContext e; SnsrRC r; SnsrSession s; int i, o, rejected = 0; const char *msg = NULL; extern char *optarg; extern int optind; const char *u = NULL; #ifdef SNSR_USE_SECURITY_CHIP uint32_t *securityChipComms(uint32_t *in); snsrConfig(SNSR_CONFIG_SECURITY_CHIP, securityChipComms); #endif if (argc == 1) usage(argv[0]); r = snsrNew(&s); if (r != SNSR_RC_OK) fatal(r, s? 
snsrErrorDetail(s): snsrRCMessage(r)); e.failed = 0; e.verbosity = 0; e.enrolled = NULL; e.adapted = NULL; e.model = DEFAULT_OUT; e.fileCount = 0; e.filename = NULL; while ((o = getopt(argc, argv, "+a:e:o:s:t:v?")) >= 0) { switch (o) { case 'a': e.adapted = optarg; break; case 'e': e.enrolled = optarg; break; case 'o': e.model = optarg; break; case 's': r = snsrSet(s, optarg); if (r == SNSR_RC_NO_MODEL) fatal(r, "set -t task before -s setting=value"); else if (r != SNSR_RC_OK) fatal(r, snsrErrorDetail(s)); break; case 't': snsrLoad(s, snsrStreamFromFileName(optarg, "r")); snsrRequire(s, SNSR_TASK_TYPE, SNSR_ENROLL); r = snsrRequire(s, SNSR_TASK_VERSION, ENROLL_TASK_VERSION); if (r != SNSR_RC_OK) fatal(r, snsrErrorDetail(s)); reportModelLicense(s, optarg, e.verbosity); break; case 'v': e.verbosity++; break; case '?': default: usage(argv[0]); } } r = snsrSetInt(s, SNSR_INTERACTIVE_MODE, 0); if (r == SNSR_RC_NO_MODEL) usage(argv[0]); /* Report application license status */ if (e.verbosity > 1) { snsrGetString(s, SNSR_LICENSE_EXPIRES, &msg); if (msg) fprintf(stderr, "\"%s\": %s.\n", argv[0], msg); } msg = NULL; snsrGetString(s, SNSR_LICENSE_WARNING, &msg); if (msg) fprintf(stderr, "WARNING for \"%s\": %s.\n", argv[0], msg); snsrSetHandler(s, SNSR_DONE_EVENT, snsrCallback(doneEvent, NULL, &e)); snsrSetHandler(s, SNSR_FAIL_EVENT, snsrCallback(failEvent, NULL, &e)); snsrSetHandler(s, SNSR_PASS_EVENT, snsrCallback(passEvent, NULL, &e)); snsrSetHandler(s, SNSR_PROG_EVENT, snsrCallback(progEvent, NULL, &e)); if (e.enrolled) snsrSetHandler(s, SNSR_ENROLLED_EVENT, snsrCallback(enrolledEvent, NULL, &e)); if (e.adapted) snsrSetHandler(s, SNSR_ADAPTED_EVENT, snsrCallback(adaptedEvent, NULL, &e)); /* SNSR_VOCAB_LIST is supported for a subset of models only, ignore errors */ if (e.verbosity > 2 && snsrRC(s) == SNSR_RC_OK) { int first = 1; snsrForEach(s, SNSR_VOCAB_LIST, snsrCallback(showVocab, NULL, &first)); snsrClearRC(s); } if (optind + 1 < argc) { int enrollmentIndex = 0, idx = 
-1, errors; if (argv[optind][0] != '+') usage(argv[0]); for (i = optind; i < argc; i++) { if (argv[i][0] == '+') { u = argv[i] + 1; snsrSetString(s, SNSR_USER, u); } else { SnsrStream a; int hasContext; hasContext = !strcmp("-c", argv[i]); if (hasContext && ++i >= argc) usage(argv[0]); a = snsrStreamFromFileName(argv[i], "r"); e.enrollfile = argv[i]; a = snsrStreamFromAudioStream(a, SNSR_ST_AF_DEFAULT); snsrSetStream(s, SNSR_SOURCE_AUDIO_PCM, a); snsrSetInt(s, SNSR_ADD_CONTEXT, hasContext); if (e.verbosity >= 2) { printf("Enrolling user \"%s\"%s from file \"%s\".\n", u, hasContext? " with context": "", argv[i]); fflush(stdout); } errors = e.failed; if (snsrRun(s) == SNSR_RC_STREAM_END) snsrClearRC(s); snsrGetInt(s, SNSR_RES_ENROLLMENT_COUNT, &idx); if (idx == enrollmentIndex && errors == e.failed) { fprintf(stderr, "Enrollment skipped for \"%s\", amplitude too low?\n", e.enrollfile); e.failed++; } if (e.failed > errors) rejected++; enrollmentIndex = idx; } if (snsrRC(s) != SNSR_RC_OK) fatal(snsrRC(s), snsrErrorDetail(s)); } } if (rejected) fatal(100,"%u enrollment %s rejected.", rejected, rejected == 1? 
"file was": "files were"); snsrForEach(s, SNSR_USER_LIST, snsrCallback(userIterator, NULL, &e)); snsrSetString(s, SNSR_USER, NULL); /* end-of-enrollment marker */ snsrSetStream(s, SNSR_SOURCE_AUDIO_PCM, snsrStreamFromString("")); if (snsrRun(s) == SNSR_RC_STREAM_END) snsrClearRC(s); if (snsrRC(s) != SNSR_RC_OK) fatal(snsrRC(s), snsrErrorDetail(s)); snsrRelease(s); snsrTearDown(); free((char **)e.filename); return e.failed; } ``` *[API]: Application Programming Interface *[SDK]: Software Development Kit *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "api/sample/c/wmme-stream.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/c/wmme-stream/" --- # wmme-stream.c This is the source for the [fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) implementation for [Windows Multimedia Extensions][], used for live audio capture on Windows. ## Instructions See [live-spot-stream.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot-stream.md#live-spot-streamc). ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/c/src/wmme-stream.{c,h}_ **wmme-stream.h:** ```c /* Sensory Confidential * Copyright (C)2017-2026 Sensory, Inc. https://sensory.com/ * * TrulyHandsfree SDK custom stream header. See wmme-stream.c. *------------------------------------------------------------------------------ */ typedef enum { STREAM_LATENCY_LOW, /* low latency, high CPU overhead */ STREAM_LATENCY_HIGH, /* higher latency, with lower CPU overhead */ } StreamLatency; SnsrStream streamFromWMME(int devid, unsigned int rate, SnsrStreamMode mode, StreamLatency latency); ``` { data-search-exclude } **wmme-stream.c:** ```c /* Sensory Confidential * Copyright (C)2017-2026 Sensory, Inc. 
https://sensory.com/
 *
 *------------------------------------------------------------------------------
 * SnsrStream provider for Windows Multimedia Extensions Waveform Audio.
 * Currently capture-only.
 *------------------------------------------------------------------------------
 */

#include <snsr.h>

#include <assert.h>
#include <stdlib.h>
#include <string.h>

#include <windows.h>
#include <mmsystem.h>

#include "wmme-stream.h"

/* Initial size of the circular capture buffer. 100 ms at 16kHz */
#define CAPTURE_MINSIZE 3200
/* Maximum size of the circular capture buffer. 10 s at 16kHz */
#define CAPTURE_MAXSIZE 320000

/* 15 ms at 16 kHz */
#define PERIOD_SIZE_LOW_LATENCY 240
/* 200 ms at 16 kHz */
#define PERIOD_SIZE_HIGH_LATENCY 3200

/* Minimum number of periods the buffer should include */
#define MIN_PERIOD_COUNT 3
/* Buffer size in ms */
#define MIN_BUFFER_MS 300

typedef struct {
  SnsrStream capture;       /* Captured audio buffer */
  const char *initErrorMsg; /* NULL if initialization was successful */
  HWAVEIN in;               /* Capture handle */
  DWORD msgThreadId;        /* Messaging thread ID */
  WAVEFORMATEX format;      /* Audio format selector */
  WAVEHDR **audioChunk;     /* Audio buffers */
  size_t chunks;            /* number of allocated audio buffers */
  UINT devId;               /* Capture device ID */
  CONDITION_VARIABLE captureNotEmpty;
  CRITICAL_SECTION captureLock;
} ProviderData;

static void
setAudioError(SnsrStream stream, SnsrRC rc, MMRESULT r, const char *tag)
{
#define ERRMSG_SIZE 512
  char errbuf[ERRMSG_SIZE];
  char *errmsg = errbuf;

  if (waveInGetErrorText(r, errmsg, ERRMSG_SIZE) == MMSYSERR_NOERROR) {
    snsrStream_setDetail(stream, "%s: %s", tag, errmsg);
  } else {
    /* Pass tag for the %s conversion; r alone would be read as a string. */
    snsrStream_setDetail(stream, "%s: error code %i", tag, (int)r);
  }
  snsrStream_setRC(stream, rc);
}

static void
setLastError(SnsrStream b, const char *tag)
{
  LPVOID lpMsgBuf;
  DWORD r = GetLastError();
  char *m;

  FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM |
                FORMAT_MESSAGE_ALLOCATE_BUFFER |
                FORMAT_MESSAGE_IGNORE_INSERTS,
                NULL, r, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
                (LPTSTR) &lpMsgBuf, 0, NULL);
  if
(lpMsgBuf) { m = (char *)lpMsgBuf; m[strlen(m) - 2] = '\0'; snsrStream_setDetail(b, "%s error: %s", tag, m); LocalFree(lpMsgBuf); } else { snsrStream_setDetail(b, "%s error #%i", tag, (int)r); } snsrStream_setRC(b, SNSR_RC_ERROR); } static void audioAvailable(HWAVEIN hwi, WAVEHDR *h) { MMRESULT r = waveInUnprepareHeader(hwi, h, sizeof(*h)); ProviderData *d = (ProviderData *)h->dwUser; size_t written; EnterCriticalSection(&d->captureLock); do { if (r != MMSYSERR_NOERROR) { setAudioError(d->capture, SNSR_RC_ERROR, r, "waveInUnprepareHeader"); break; } if (!d->msgThreadId) break; assert(d->format.nChannels == 1); written = snsrStreamWrite(d->capture, h->lpData, 1, h->dwBytesRecorded); if (written != h->dwBytesRecorded) { snsrStream_setRC(d->capture, SNSR_RC_BUFFER_OVERRUN); break; } r = waveInPrepareHeader(hwi, h, sizeof(*h)); if (r != MMSYSERR_NOERROR) { setAudioError(d->capture, SNSR_RC_ERROR, r, "waveInPrepareHeader"); break; } r = waveInAddBuffer(hwi, h, sizeof(*h)); if (r != MMSYSERR_NOERROR) { waveInUnprepareHeader(hwi, h, sizeof(*h)); setAudioError(d->capture, SNSR_RC_ERROR, r, "waveInAddBuffer"); } } while (0); LeaveCriticalSection(&d->captureLock); WakeConditionVariable(&d->captureNotEmpty); } static DWORD WINAPI threadProc(LPVOID lpParameter) { BOOL r; MSG m; while ((r = GetMessage(&m, (HWND)-1, 0, 0)) > 0) { switch (m.message) { case MM_WIM_DATA: audioAvailable((HWAVEIN)m.wParam, (WAVEHDR *)m.lParam); break; case WM_QUIT: case MM_WIM_CLOSE: return ERROR_SUCCESS; } } if (r < 0) return ERROR_INVALID_HANDLE; return ERROR_SUCCESS; } /*------------------------------------------------------------------------------ */ static SnsrRC streamOpen(SnsrStream b) { ProviderData *d = (ProviderData *)snsrStream_getData(b); HANDLE th; MMRESULT r; size_t i; if (d->initErrorMsg) { snsrStream_setDetail(b, "%s", d->initErrorMsg); return SNSR_RC_NOT_FOUND; } snsrStreamOpen(d->capture); assert(!d->msgThreadId); th = CreateThread(0, 0, (LPTHREAD_START_ROUTINE)threadProc, 0, 0, 
&d->msgThreadId); if (!th) { setLastError(b, "CreateThread"); return snsrStreamRC(b); } CloseHandle(th); do { r = waveInOpen(&d->in, d->devId, &d->format, (DWORD_PTR)d->msgThreadId, 0, CALLBACK_THREAD); if (r != MMSYSERR_NOERROR) { setAudioError(b, SNSR_RC_NOT_FOUND, r, "waveInOpen"); break; } for (i = 0; i < d->chunks; i++) { r = waveInPrepareHeader(d->in, d->audioChunk[i], sizeof(*d->audioChunk[i])); if (r != MMSYSERR_NOERROR) { setAudioError(b, SNSR_RC_ERROR, r, "waveInPrepareHeader"); break; } r = waveInAddBuffer(d->in, d->audioChunk[i], sizeof(*d->audioChunk[i])); if (r != MMSYSERR_NOERROR) { setAudioError(b, SNSR_RC_ERROR, r, "waveInAddBuffer"); break; } } r = waveInStart(d->in); if (r != MMSYSERR_NOERROR) setAudioError(b, SNSR_RC_ERROR, r, "waveInStart"); } while (0); if (snsrStreamRC(b) != SNSR_RC_OK) { if (d->in) { waveInClose(d->in); d->in = NULL; } if (d->msgThreadId) { PostThreadMessage(d->msgThreadId, WM_QUIT, 0, 0); d->msgThreadId = 0; } } return snsrStreamRC(b); } static SnsrRC streamClose(SnsrStream b) { ProviderData *d = (ProviderData *)snsrStream_getData(b); size_t i; /* Shut down the messaging thread. */ PostThreadMessage(d->msgThreadId, WM_QUIT, 0, 0); d->msgThreadId = 0; /* Unprepare all headers. 
*/ waveInReset(d->in); for (i = 0; i < d->chunks; i++) { WAVEHDR *h = d->audioChunk[i]; while (waveInUnprepareHeader(d->in, h, sizeof(*h)) == WAVERR_STILLPLAYING) Sleep(10); } while (waveInClose(d->in) == WAVERR_STILLPLAYING) Sleep(10); /* Flush the capture buffer */ snsrStreamSkip(d->capture, 1, CAPTURE_MAXSIZE); snsrStream_setRC(d->capture, SNSR_RC_OK); return snsrStreamRC(b); } static void streamRelease(SnsrStream b) { ProviderData *d = (ProviderData *)snsrStream_getData(b); size_t i; snsrRelease(d->capture); if (d->audioChunk) { for (i = 0; i < d->chunks; i++) { if (d->audioChunk[i]) free(d->audioChunk[i]->lpData); free(d->audioChunk[i]); } free(d->audioChunk); } free((void *)d->initErrorMsg); free(d); } static size_t streamRead(SnsrStream b, void *buffer, size_t size) { SnsrRC r; ProviderData *d = (ProviderData *)snsrStream_getData(b); size_t read = 0; EnterCriticalSection(&d->captureLock); do { read += snsrStreamRead(d->capture, (char *)buffer + read, 1, size - read); r = snsrStreamRC(d->capture); } while ((r == SNSR_RC_OK || r == SNSR_RC_EOF) && read < size && SleepConditionVariableCS(&d->captureNotEmpty, &d->captureLock, INFINITE)); if (r != SNSR_RC_OK) { snsrStream_setRC(b, r); snsrStream_setDetail(b, "%s", snsrStreamErrorDetail(d->capture)); } else if (read < size) { snsrStream_setRC(b, SNSR_RC_EOF); } LeaveCriticalSection(&d->captureLock); return read; } static SnsrStream_Vmt ProviderDef = { "WMME audio capture", &streamOpen, &streamClose, &streamRelease, &streamRead, NULL }; SnsrStream streamFromWMME(int deviceId, unsigned int rate, SnsrStreamMode mode, StreamLatency latency) { SnsrStream b; ProviderData *d = (ProviderData *)malloc(sizeof(*d)); size_t chunkSize = 0, i; if (!d) return NULL; memset(d, 0, sizeof(*d)); b = snsrStream_alloc(&ProviderDef, d, 1, 0); if (!b) { free(d); return NULL; } do { d->devId = deviceId == -1? 
WAVE_MAPPER: (UINT)deviceId; d->capture = snsrStreamFromBuffer(CAPTURE_MINSIZE, CAPTURE_MAXSIZE); if (!d->capture) { snsrStream_setRC(b, SNSR_RC_NO_MEMORY); break; } snsrRetain(d->capture); if (mode != SNSR_ST_MODE_READ) { snsrStream_setRC(b, SNSR_RC_INVALID_MODE); break; } /* Signalling and mutexes */ InitializeCriticalSection(&d->captureLock); InitializeConditionVariable(&d->captureNotEmpty); /* Prepare capture format description */ d->format.wFormatTag = WAVE_FORMAT_PCM; d->format.wBitsPerSample = 16; d->format.nChannels = 1; d->format.nSamplesPerSec = rate; d->format.nBlockAlign = d->format.nChannels * d->format.wBitsPerSample / 8; d->format.nAvgBytesPerSec = d->format.nBlockAlign * d->format.nSamplesPerSec; d->format.cbSize = 0; /* Allocate buffers */ switch (latency) { case STREAM_LATENCY_LOW: chunkSize = PERIOD_SIZE_LOW_LATENCY; break; case STREAM_LATENCY_HIGH: chunkSize = PERIOD_SIZE_HIGH_LATENCY; break; } d->chunks = (int)(MIN_BUFFER_MS * rate / 1000.0 / chunkSize + 0.5); if (d->chunks < MIN_PERIOD_COUNT) d->chunks = MIN_PERIOD_COUNT; d->audioChunk = malloc(d->chunks * sizeof(*d->audioChunk)); if (!d->audioChunk) { snsrStream_setRC(b, SNSR_RC_NO_MEMORY); break; } memset(d->audioChunk, 0, d->chunks * sizeof(*d->audioChunk)); for (i = 0; i < d->chunks; i++) { d->audioChunk[i] = malloc(sizeof(**d->audioChunk)); if (!d->audioChunk[i]) { snsrStream_setRC(b, SNSR_RC_NO_MEMORY); break; } memset(d->audioChunk[i], 0, sizeof(**d->audioChunk)); d->audioChunk[i]->dwBufferLength = (DWORD)(chunkSize * sizeof(short) * d->format.nChannels); d->audioChunk[i]->lpData = malloc(d->audioChunk[i]->dwBufferLength); if (!d->audioChunk[i]->lpData) { snsrStream_setRC(b, SNSR_RC_NO_MEMORY); break; } d->audioChunk[i]->dwUser = (DWORD_PTR)d; } } while (0); if (snsrStreamRC(b) != SNSR_RC_OK) d->initErrorMsg = _strdup(snsrStreamErrorDetail(b)); return b; } ``` [Windows Multimedia Extensions]: https://learn.microsoft.com/en-us/windows/win32/api/mmeapi/nf-mmeapi-waveinopen "waveInOpen" 
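
The buffer sizing in `streamFromWMME()` is easier to follow with numbers plugged in. The sketch below repeats the `d->chunks` arithmetic as a standalone function. The names `wmmeChunkCount` and `wmmeChunkBytes` are hypothetical helpers used only for illustration and are not part of the SDK. The constants are copied from the listing above, and the result type is `size_t` rather than the `int` cast used there.

```c
#include <assert.h>
#include <stddef.h>

/* Constants copied from wmme-stream.c. Period sizes are in samples. */
#define PERIOD_SIZE_LOW_LATENCY  240   /* 15 ms at 16 kHz */
#define PERIOD_SIZE_HIGH_LATENCY 3200  /* 200 ms at 16 kHz */
#define MIN_PERIOD_COUNT 3
#define MIN_BUFFER_MS 300

/* Hypothetical helper (not an SDK function): number of capture buffers
 * needed to queue at least MIN_BUFFER_MS of audio at the given sample
 * rate, rounded to nearest and never fewer than MIN_PERIOD_COUNT. */
static size_t wmmeChunkCount(unsigned int rate, size_t chunkSize)
{
  size_t chunks;

  assert(chunkSize > 0);
  chunks = (size_t)(MIN_BUFFER_MS * rate / 1000.0 / chunkSize + 0.5);
  if (chunks < MIN_PERIOD_COUNT) chunks = MIN_PERIOD_COUNT;
  return chunks;
}

/* Hypothetical helper: size in bytes of one capture buffer for
 * 16-bit mono PCM, matching dwBufferLength in the listing. */
static size_t wmmeChunkBytes(size_t chunkSize)
{
  return chunkSize * sizeof(short);
}
```

With these constants, `wmmeChunkCount(16000, PERIOD_SIZE_LOW_LATENCY)` is 20, which queues 20 × 15 ms = 300 ms of audio. `wmmeChunkCount(16000, PERIOD_SIZE_HIGH_LATENCY)` rounds to 2 and is then raised to `MIN_PERIOD_COUNT`, giving 3 buffers of 200 ms each.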
*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[WMME]: Windows Multimedia Extensions, the audio capture API on Windows

---
source_path: "api/sample/ios/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/ios/"
---

# iOS examples

The iOS sample programs are available in _sample/ios/_ in the TrulyNatural installation directory. See _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/ios/_

New to the Session API? Start with [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program) for the iOS C-via-Swift wake-word flow, then explore the sample below.

## Examples

[PhraseSpot](https://doc.sensory.com/tnl/7.8/api/sample/ios/phrasespot.md#ios-ps) - Runs a wake word recognizer and shows the results in a text window.

*[API]: Application Programming Interface
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "api/sample/ios/phrasespot.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/ios/phrasespot/"
---

# PhraseSpot

This Swift application runs a phrase spotter and shows the results in a text window.

## Instructions

Open _sample/ios/PhraseSpot/PhraseSpot.xcodeproj_ in [Xcode][], then choose **Run** from the **Product** menu.

To run on a real device, select your Team in `General > Signing` and change `Identity > Bundle Identifier` to match your development domain.

The app starts listening for "hello blue genie" upon startup. Say this trigger phrase to see the recognizer response.

## Code

This application uses the native C API of the TrulyNatural SDK by using a Swift [bridging header][].

Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/ios/PhraseSpot/PhraseSpot/PhraseSpot.swift_

**PhraseSpot.swift:**

```swift
//
// PhraseSpot.swift
// PhraseSpot
//
// Copyright © 2018-2026 Sensory, Inc.
// https://sensory.com/
// All rights reserved.
//

import Foundation
import AVFoundation

protocol PhraseSpotDelegate: AnyObject {
    func recogniserWillStart()
    func recognizerDidStop(code: SnsrRC, message: String)
    func recognizerDidSpot(text: String, beginMs: Double, endMs: Double)
}

enum PhraseSpotError: Error {
    case api(code: SnsrRC, message: String)
}

class PhraseSpot {

    //MARK: Nested classes
    enum State {
        case stopped, started, paused
    }

    //MARK: Private properties
    private var session: SnsrSession?
    private var audio: SnsrStream?
    private var rC: SnsrRC = SNSR_RC_OK

    //MARK: Properties
    var libraryInfo: String?
    weak var delegate: PhraseSpotDelegate?
    var state: State = .stopped {
        didSet {
            if state == .started {
                startRecog()
            }
        }
    }

    //MARK: Initialization
    init(modelName: String) throws {
        try load(modelName: modelName)
    }

    deinit {
        if state == .started {
            stop()
        }
        release(&session)
        release(&audio)
    }

    //MARK: Private methods

    // C library wrappers, for convenience
    private func release(_ ptr: inout Optional<OpaquePointer>) {
        snsrRelease(UnsafeRawPointer(ptr))
        ptr = nil
    }

    private func retain(_ ptr: Optional<OpaquePointer>) {
        snsrRetain(UnsafeRawPointer(ptr))
    }

    // Find a model in the application's main bundle.
    private func modelPath(_ modelName: String) -> String {
        guard let path = Bundle.main.path(forResource: modelName,
                                          ofType: "snsr",
                                          inDirectory: "models") else {
            return modelName
        }
        return path
    }

    // Create and throw a PhraseSpotError.api
    private func throwIfError(_ session: SnsrSession?) throws {
        let rc = snsrRC(session)
        if (rc != SNSR_RC_OK) {
            let msg = String(cString: snsrErrorDetail(session))
            print(msg)
            throw PhraseSpotError.api(code: rc, message: msg)
        }
    }

    private func load(modelName: String) throws {
        snsrNewIncludeOSS(&session, SNSR_VERSION)
        try throwIfError(session)
        var libInfo: UnsafePointer<CChar>?
        snsrGetString(session, SNSR_LIBRARY_INFO, &libInfo)
        try throwIfError(session)
        libraryInfo = String(cString: libInfo!)
        snsrLoad(session, snsrStreamFromFileName(modelPath(modelName), "r"))
        snsrRequire(session, SNSR_TASK_TYPE, SNSR_PHRASESPOT)
        snsrRequire(session, SNSR_TASK_VERSION, "~0.5.0 || 1.0.0")

        // Convert self into a pointer to pass to the C library
        let selfPtr = UnsafeMutableRawPointer(Unmanaged.passUnretained(self).toOpaque())

        // Report recognition results with a delegate.recognizerDidSpot()
        snsrSetHandler(session, SNSR_RESULT_EVENT, snsrCallback({ (session, key, selfPtr) -> SnsrRC in
            let my = Unmanaged<PhraseSpot>.fromOpaque(selfPtr!).takeUnretainedValue()
            var text: UnsafePointer<CChar>?
            var beginMs: Double = 0
            var endMs: Double = 0
            snsrGetString(session, SNSR_RES_TEXT, &text)
            snsrGetDouble(session, SNSR_RES_BEGIN_MS, &beginMs)
            snsrGetDouble(session, SNSR_RES_END_MS, &endMs)
            let spot = String(cString: text!)
            DispatchQueue.main.sync {
                my.delegate?.recognizerDidSpot(text: spot, beginMs: beginMs, endMs: endMs)
            }
            return SNSR_RC_OK
        }, nil, selfPtr))

        // Stop background recognition if the state changes from .started
        snsrSetHandler(session, SNSR_SAMPLES_EVENT, snsrCallback({ (session, key, selfPtr) -> SnsrRC in
            let my = Unmanaged<PhraseSpot>.fromOpaque(selfPtr!).takeUnretainedValue()
            return my.state == .started ?
SNSR_RC_OK : SNSR_RC_STOP }, nil, selfPtr)) // Allow Bluetooth headsets try AVAudioSession.sharedInstance() .setCategory(.playAndRecord, options: AVAudioSession.CategoryOptions.allowBluetooth) // Live audio audio = snsrStreamFromDefaultAudioDevice() retain(audio) snsrSetStream(session, SNSR_SOURCE_AUDIO_PCM, audio) try throwIfError(session) } // Run phrase spotter on a background thread private func startRecog() { snsrClearRC(session) self.delegate?.recogniserWillStart() DispatchQueue.global(qos: .background).async { let code = snsrRun(self.session) // Stop recording when we are not spotting snsrStreamClose(self.audio) let msg = String(cString: snsrErrorDetail(self.session)) DispatchQueue.main.sync { self.delegate?.recognizerDidStop(code: code, message: msg) } } } //MARK: Public methods // Change scalar SnsrSession settings func set(_ key: String, _ value: Double) throws { let code = snsrSetInt(session, key, Int32(value)) if code == SNSR_RC_INCORRECT_SETTING_TYPE { snsrClearRC(session) snsrSetDouble(session, key, value) } try throwIfError(session) } // Start the phrase spotter func start() { if state != .started { state = .started } } // Stop the recognizer func stop() { if state != .stopped { state = .stopped } } // Pause a running spotter func pause() { if state == .started { state = .paused } } // Resume a paused spotter func resume() { if state == .paused { state = .started } } } ``` [bridging header]: https://developer.apple.com/documentation/swift/importing-objective-c-into-swift "Importing Objective-C into Swift" [Xcode]: https://developer.apple.com/xcode/ "Xcode enables you to develop, test, and distribute apps for all Apple platforms" *[API]: Application Programming Interface *[SDK]: Software Development Kit *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "api/sample/java/SnsrEnrollmentTest.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/java/SnsrEnrollmentTest/" --- # SnsrEnrollmentTest.java 
This file contains UDT enrollment and evaluation unit tests. It shows how to remove an enrollment from an enrollment context loaded from file. ## Instructions To run these tests, open a terminal window and enter the commands after the `%` prompt below (on Windows, replace `./gradlew` with `gradlew.bat`). ```console % cd ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/java/enroll-udt/ % ./gradlew test BUILD SUCCESSFUL in 7s 6 actionable tasks: 6 executed ``` ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/java/enroll-udt/src/test/java/SnsrEnrollmentTest.java_ **SnsrEnrollmentTest.java:** ```java /* Sensory Confidential * Copyright (C)2017-2026 Sensory, Inc. https://sensory.com/ * * Unit tests for UDT enrollment and generated spotter tasks. *------------------------------------------------------------------------------ */ package com.sensory.speech.snsr.test; import java.io.IOException; import java.util.*; import org.junit.Test; import static org.junit.Assert.*; import org.junit.*; import org.junit.runners.MethodSorters; import com.sensory.speech.snsr.*; import enroll.BuildConfig; @FixMethodOrder(MethodSorters.NAME_ASCENDING) public class SnsrEnrollmentTest { final String EnrollmentContext = BuildConfig.MODEL_DIR + "/test-enrollment-context.snsr"; final String SvModel1 = BuildConfig.MODEL_DIR + "/test-1.snsr"; final String SvModel2 = BuildConfig.MODEL_DIR + "/test-2.snsr"; final String[] Users = { "armadillo-1", "jackalope-1" }; final String[] TestUsers = { "armadillo-1", "armadillo-6", "jackalope-1", "jackalope-4", "terminator-2", "terminator-6" }; // Enroll two users from file. // Save the adapted enrollment context to file. 
// This test has to run before enrollFromContext* @Test public void EnrollFromFile() { SnsrSession s = new SnsrSession(); try { s .load(BuildConfig.UDT_MODEL) .require(Snsr.TASK_TYPE, Snsr.ENROLL) .setHandler(Snsr.DONE_EVENT, (session, key) -> { // Save enrolled model for further testing SnsrStream out = SnsrStream.fromFileName(SvModel2, "w"); try { out.copy(session.getStream(Snsr.MODEL_STREAM)); out.close(); } catch (IOException e) { fail(e.toString()); } System.out.println("Enrolled model saved to " + SvModel2); return SnsrRC.OK; }) .setHandler(Snsr.PROG_EVENT, (session, key) -> { double p = session.getDouble(Snsr.RES_PERCENT_DONE); System.out.println(String.format("Adapting: %3.0f%% done.", p)); return SnsrRC.OK; }) // Prepare for non-interactive enrollment .setInt(Snsr.INTERACTIVE_MODE, 0); // Enroll example users for (String tag: Users) { s.setString(Snsr.USER, tag); for (int i = 0; i < 4; i++) { final String path = String.format("%s/%s-%d.wav", BuildConfig.ENROLLMENT_DIR, tag, i); SnsrStream a = SnsrStream.fromAudioFile(path, "r"); s .setStream(Snsr.SOURCE_AUDIO_PCM, a) .run(); assertEquals(SnsrRC.STREAM_END, s.rC()); System.out.println("Enrolled " + tag + " with " + path); } } // List enrolled users s.forEach(Snsr.USER_LIST, (session, key) -> { System.out.println("Enrolled: " + session.getString(Snsr.USER)); return SnsrRC.OK; }); // End-of-enrollment markers s .setString(Snsr.USER, null) .setStream(Snsr.SOURCE_AUDIO_PCM, SnsrStream.fromString("")); assertEquals(SnsrRC.OK, s.rC()); s.run(); assertEquals(SnsrRC.STREAM_END, s.rC()); // Save adapted enrollment context s.save(SnsrDataFormat.RUNTIME, EnrollmentContext); } catch (IOException e) { fail(e.toString()); } s.release(); } // Load the adapted enrollment context created by "enrollment" above. 
@Test public void enrollFromContext() { SnsrSession s = new SnsrSession(); try { s .load(BuildConfig.UDT_MODEL) .require(Snsr.TASK_TYPE, Snsr.ENROLL) // Load the enrollment context after loading the primary model .load(EnrollmentContext) // Prepare for non-interactive enrollment .setInt(Snsr.INTERACTIVE_MODE, 0); // List enrolled users final List enrolledUsers = new ArrayList(); s.forEach(Snsr.USER_LIST, (session, key) -> { enrolledUsers.add(session.getString(Snsr.USER)); return SnsrRC.OK; }); assertEquals(Arrays.toString(Users), enrolledUsers.toString()); // End-of-enrollment markers s .setString(Snsr.USER, null) .setStream(Snsr.SOURCE_AUDIO_PCM, SnsrStream.fromString("")); assertEquals(SnsrRC.OK, s.rC()); s.run(); assertEquals(SnsrRC.STREAM_END, s.rC()); } catch (IOException e) { fail(e.toString()); } s.release(); } // Load the adapted enrollment context created by "enrollment" above. // Remove one user, re-adapt. @Test public void enrollFromContextRemoveOne() { SnsrSession s = new SnsrSession(); try { s .load(BuildConfig.UDT_MODEL) .require(Snsr.TASK_TYPE, Snsr.ENROLL) // Load the enrollment context after loading the primary model .load(EnrollmentContext) // Prepare for non-interactive enrollment .setInt(Snsr.INTERACTIVE_MODE, 0) // Save enrolled model for further testing .setHandler(Snsr.DONE_EVENT, (session, key) -> { SnsrStream out = SnsrStream.fromFileName(SvModel1, "w"); try { out.copy(session.getStream(Snsr.MODEL_STREAM)); out.close(); } catch (IOException e) { fail(e.toString()); } System.out.println("Enrolled model saved to " + SvModel1); return SnsrRC.OK; }) // Remove the first enrollment .setString(Snsr.DELETE_USER, Users[0]); // List enrolled users final List enrolledUsers = new ArrayList(); s.forEach(Snsr.USER_LIST, (session, key) -> { enrolledUsers.add(session.getString(Snsr.USER)); return SnsrRC.OK; }); assertEquals(Arrays.toString(Arrays.copyOfRange(Users, 1, Users.length)), enrolledUsers.toString()); // End-of-enrollment markers s 
.setString(Snsr.USER, null) .setStream(Snsr.SOURCE_AUDIO_PCM, SnsrStream.fromString("")); assertEquals(SnsrRC.OK, s.rC()); s.run(); assertEquals(SnsrRC.STREAM_END, s.rC()); } catch (IOException e) { fail(e.toString()); } s.release(); } // Load the adapted enrollment context created by "enrollment" above. // Remove two users, re-adapt. @Test public void enrollFromContextRemoveTwo() { SnsrSession s = new SnsrSession(); // Holder for DONE_EVENT count final int[] doneEventCount = new int[1]; try { s .load(BuildConfig.UDT_MODEL) .require(Snsr.TASK_TYPE, Snsr.ENROLL) // Load the enrollment context after loading the primary model .load(EnrollmentContext) // Prepare for non-interactive enrollment .setInt(Snsr.INTERACTIVE_MODE, 0) .setHandler(Snsr.DONE_EVENT, (session, key) -> { // Invoked for each deleted user doneEventCount[0]++; return SnsrRC.STOP; }); // Remove the first enrollment doneEventCount[0] = 0; s .setString(Snsr.DELETE_USER, Users[0]) .setString(Snsr.DELETE_USER, Users[1]); assertEquals(SnsrRC.STOP, s.rC()); assertEquals(doneEventCount[0], 2); // List enrolled users final List enrolledUsers = new ArrayList(); s.forEach(Snsr.USER_LIST, (session, key) -> { enrolledUsers.add(session.getString(Snsr.USER)); return SnsrRC.OK; }); assertEquals(Arrays.toString(Arrays.copyOfRange(Users, 2, Users.length)), enrolledUsers.toString()); // End-of-enrollment markers s .setString(Snsr.USER, null) .setStream(Snsr.SOURCE_AUDIO_PCM, SnsrStream.fromString("")); assertEquals(SnsrRC.OK, s.rC()); s.run(); assertEquals(SnsrRC.STREAM_END, s.rC()); } catch (IOException e) { fail(e.toString()); } s.release(); } // Returns SnsrStream concatenation of test files. 
private SnsrStream testAudioStream() { SnsrStream c = SnsrStream.fromString(""); for (String tag: TestUsers) { for (int i = 4; i <= 5; i++) { final String path = String.format("%s/%s-%d.wav", BuildConfig.ENROLLMENT_DIR, tag, i); c = SnsrStream.fromStreams(c, SnsrStream.fromAudioFile(path, "r")); } } return c; } // Evaluate model created by enrollFromContextRemoveOne() // Just jackalope-1 should spot, as armadillo-1 was removed. @Test public void evalModelOneUser() { SnsrSession s = new SnsrSession(); final List result = new ArrayList(); try { s .load(SvModel1) .require(Snsr.TASK_TYPE, Snsr.PHRASESPOT) .setHandler(Snsr.RESULT_EVENT, (session, key) -> { result.add(session.getString(Snsr.RES_TEXT)); return SnsrRC.OK; }) .setStream(Snsr.SOURCE_AUDIO_PCM, testAudioStream()) .run() .release(); } catch (IOException e) { fail(e.toString()); } assertEquals("[jackalope-1, jackalope-1]", result.toString()); } // Evaluate model created by EnrollFromContext(). // Both armadillo-1 and jackalope-1 should spot. 
@Test public void evalModelTwoUsers() { SnsrSession s = new SnsrSession(); final List result = new ArrayList(); try { s .load(SvModel2) .require(Snsr.TASK_TYPE, Snsr.PHRASESPOT) .setHandler(Snsr.RESULT_EVENT, (session, key) -> { result.add(session.getString(Snsr.RES_TEXT)); return SnsrRC.OK; }) .setStream(Snsr.SOURCE_AUDIO_PCM, testAudioStream()) .run() .release(); } catch (IOException e) { fail(e.toString()); } assertEquals("[armadillo-1, armadillo-1, jackalope-1, jackalope-1]", result.toString()); } } ``` *[API]: Application Programming Interface *[SDK]: Software Development Kit *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[UDT]: User-Defined Trigger: enrolled wake words and command sets --- source_path: "api/sample/java/SnsrStreamAudioDeviceGeneric.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/java/SnsrStreamAudioDeviceGeneric/" --- # SnsrStreamAudioDeviceGeneric.java This is the source for the [fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) implementation for Java. It provides a [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) adapter for [Java Audio][]. **Also see these related items:** [fromProvider](https://doc.sensory.com/tnl/7.8/api/io.md#fromprovider) For the Python live-audio sample, see [live_audio.py](https://doc.sensory.com/tnl/7.8/api/sample/python/live_audio.md#live_audiopy). ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/java/misc/SnsrStreamAudioDeviceGeneric.java_ **SnsrStreamAudioDeviceGeneric.java:** ```java /* Sensory Confidential * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/ * * Generic audio recording to read-only SnsrStream adapter. 
*------------------------------------------------------------------------------ */ package com.sensory.speech.snsr; import java.io.IOException; import javax.sound.sampled.AudioFormat; import javax.sound.sampled.AudioSystem; import javax.sound.sampled.Line; import javax.sound.sampled.LineUnavailableException; import javax.sound.sampled.Mixer; import javax.sound.sampled.TargetDataLine; import com.sensory.speech.snsr.SnsrStream; /* * Implements the SnsrStream.Provider interface for live audio. * * Create a new SnsrStream instance with: * SnsrStream a = SnsrStream.fromProvider(new SnsrStreamAudioDeviceGeneric(16000), * SnsrStreamMode.READ); */ class SnsrStreamAudioDeviceGeneric implements SnsrStream.Provider { private TargetDataLine mInput; private AudioFormat mAudioFormat; public SnsrStreamAudioDeviceGeneric(int sampleRate) { mInput = getDefaultMicrophone(); mAudioFormat = new AudioFormat((float) sampleRate, 16, 1, true, false); } @Override public long onOpen() throws IOException { if (mInput == null) return NOT_OPEN; try { mInput.open(mAudioFormat); mInput.start(); } catch (LineUnavailableException e) { throw new IOException(e.toString()); } return OK; } @Override public long onClose() throws IOException { mInput.stop(); mInput.close(); return OK; } @Override public void onRelease() { mInput = null; } @Override public long onRead(byte[] buffer) throws IOException { long read = mInput.read(buffer, 0, buffer.length); if (Thread.interrupted()) return INTERRUPTED; return read; } @Override public long onWrite(byte[] buffer) throws IOException { return NOT_IMPLEMENTED; } /* Find the default system microphone */ private TargetDataLine getDefaultMicrophone() { Mixer.Info[] mixers = AudioSystem.getMixerInfo(); for (Mixer.Info mixerInfo : mixers) { Mixer m = AudioSystem.getMixer(mixerInfo); try { m.open(); m.close(); } catch (Exception e) { continue; } Line.Info[] lines = m.getTargetLineInfo(); for (Line.Info l : lines) { try { TargetDataLine t = (TargetDataLine) 
            AudioSystem.getLine(l);
          if (t != null) return t;
        } catch (Exception e) { /* ignore */ }
      }
    }
    return null;
  }
}
```

[Java Audio]: https://docs.oracle.com/javase/tutorial/sound/capturing.html "Audio capturing in Java"

*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "api/sample/java/enrollUDT.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/java/enrollUDT/"
---

# enrollUDT.java

This example shows how to enroll a user-defined wake word (UDT, trigger, key word spotter).

For Python UDT enrollment, see [live_enroll.py](https://doc.sensory.com/tnl/7.8/api/sample/python/live_enroll.md#live_enrollpy).

## Instructions

To run this example, choose a wake word phrase, open a terminal window and enter the commands after the `%` prompt below (on Windows, replace `./gradlew` with `gradlew.bat`). Speak when prompted.

```console
% cd ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/java/enroll-udt/
% ./gradlew -q --console=plain enroll
Please say your enrollment phrase (1/4)
Recording: 3.80 s
Recording passes preliminary tests.
Please say your enrollment phrase (2/4)
Recording: 3.60 s
Recording passes preliminary tests.
Please say your enrollment phrase (3/4) with context,
 for example: " will it rain tomorrow?"
Recording: 5.01 s
Recording passes preliminary tests.
Please say your enrollment phrase (4/4) with context,
 for example: " will it rain tomorrow?"
Recording: 4.31 s
Recording passes preliminary tests.
Adapting: 100% done.
Enrollment context saved to ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/java/enroll-udt/build/model/enrollment-context.snsr
Enrolled model saved to ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/java/enroll-udt/build/model/enrolled-sv.snsr
Done!
```

You can make additional enrollments by specifying a unique phrase tag on the command line. New enrollments replace previous ones that used the same `tag`.
```console
% ./gradlew -q --console=plain enroll -Ptag=second-phrase
```

Use the `eval` target to test the wake word enrollment(s) (**Also see these related items:** [evalUDT.java](https://doc.sensory.com/tnl/7.8/api/sample/java/evalUDT.md#evaludtjava)). Stop the process with `^C` when you're done.

```console
% ./gradlew -q --console=plain eval
Say your enrolled phrase.
#00 "custom-phrase", score = 0.817
 [12795 ms, 13875 ms] custom-phrase
Recording: 16.91 ^C
```

To start over, remove the existing enrollments:

```console
% ./gradlew -q --console=plain clean
```

## Code

Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/java/enroll-udt/src/main/java/com/sensory/speech/snsr/demo/enroll-udt/enrollUDT.java_

**enrollUDT.java:**

```java
/* Sensory Confidential
 * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/
 *
 * Command-line User-Defined Trigger enrollment.
 *------------------------------------------------------------------------------
 */

import java.io.Console;
import java.io.IOException;

import com.sensory.speech.snsr.Snsr;
import com.sensory.speech.snsr.SnsrDataFormat;
import com.sensory.speech.snsr.SnsrRC;
import com.sensory.speech.snsr.SnsrSession;
import com.sensory.speech.snsr.SnsrStream;

import enroll.BuildConfig;

public class enrollUDT {
  public static void main(String argv[]) {
    final int SAMPLE_RATE = 16000;
    final String SpeakNow = "\nPlease say your enrollment phrase";
    final String EnrollmentContext =
      BuildConfig.MODEL_DIR + "/enrollment-context.snsr";
    String userTag = "custom-phrase";

    if (argv.length == 1) userTag = argv[0];
    else if (argv.length != 0) {
      System.out.println("usage: ./gradlew enroll [-Ptag=user-or-phrase-tag]");
      System.exit(7);
    }

    // Live audio stream handle.
    SnsrStream audio = SnsrStream.fromAudioDevice();

    // Primary TrulyHandsfree session handle.
SnsrSession s = new SnsrSession(); try { s.load(BuildConfig.UDT_MODEL).require(Snsr.TASK_TYPE, Snsr.ENROLL); } catch (IOException e) { e.printStackTrace(); System.exit(3); } try { s.load(EnrollmentContext); try { s.setString(Snsr.DELETE_USER, userTag); } catch (Exception e) { } System.out.println("Loaded enrollments from " + EnrollmentContext); s.forEach(Snsr.USER_LIST, (ses, key) -> { System.out.println("User " + ses.getString(Snsr.USER) + " has " + ses.getInt(Snsr.RES_ENROLLMENT_COUNT) + " enrollments."); return SnsrRC.OK; }); } catch (IOException e) { // ignore } s.setStream(Snsr.SOURCE_AUDIO_PCM, audio) .setString(Snsr.USER, userTag) .setHandler(Snsr.FAIL_EVENT, (ses, key) -> { System.out.println("This enrollment recording is not usable."); System.out.println(" Reason: " + ses.getString(Snsr.RES_REASON)); System.out.println(" Fix: " + ses.getString(Snsr.RES_GUIDANCE)); return SnsrRC.OK; }) .setHandler(Snsr.PASS_EVENT, (ses, key) -> { System.out.println("Recording passes preliminary tests."); return SnsrRC.OK; }) .setHandler(Snsr.PROG_EVENT, (ses, key) -> { double p = ses.getDouble(Snsr.RES_PERCENT_DONE); System.out.print(String.format("\rAdapting: %3.0f%% done. ", p)); if (p >= 100) System.out.println(""); return SnsrRC.OK; }) .setHandler(Snsr.PAUSE_EVENT, (ses, key) -> { // Pause recording while processing. System.out.println(""); audio.close(); return SnsrRC.OK; }) .setHandler(Snsr.RESUME_EVENT, (ses, key) -> { try { // Restart recording. 
audio.open(); } catch (Exception e) { e.printStackTrace(); } String prompt = SpeakNow + " (" + (ses.getInt(Snsr.RES_ENROLLMENT_COUNT) + 1) + "/" + ses.getInt(Snsr.ENROLLMENT_TARGET) + ")"; if (ses.getInt(Snsr.ADD_CONTEXT) != 0) { prompt += " with context,\n for example: " + "\" will it rain tomorrow?\""; } System.out.println(prompt); return SnsrRC.OK; }) .setHandler(Snsr.DONE_EVENT, (ses, key) -> { SnsrStream out = SnsrStream.fromFileName(BuildConfig.ENROLLED_MODEL, "w"); try { out.copy(ses.getStream(Snsr.MODEL_STREAM)); System.out.println("Enrolled model saved to " + BuildConfig.ENROLLED_MODEL); } catch (Exception e) { e.printStackTrace(); } out.close(); System.out.println("Done!"); return SnsrRC.STOP; }) // Optional: save enrollment context // Use Snsr.ENROLLED_EVENT to save the // unadapted enrollment context instead. .setHandler(Snsr.ADAPTED_EVENT, (ses, key) -> { ses.save(SnsrDataFormat.RUNTIME, EnrollmentContext); System.out.println("Enrollment context saved to " + EnrollmentContext); return SnsrRC.OK; }) // Show audio recording duration .setHandler(Snsr.SAMPLES_EVENT, (ses, key) -> { double count = ses.getDouble(Snsr.RES_SAMPLES); System.out.print(String.format("\rRecording: %6.2f s ", count / SAMPLE_RATE)); return SnsrRC.OK; }); try { s.run(); // Optional but good practice. finalize() will (eventually) release. s.release(); audio.release(); } catch (IOException e) { e.printStackTrace(); } } } ``` *[API]: Application Programming Interface *[SDK]: Software Development Kit *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[UDT]: User-Defined Trigger: enrolled wake words and command sets --- source_path: "api/sample/java/evalUDT.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/java/evalUDT/" --- # evalUDT.java This example shows how to run a wake word recognizer. It uses the UDT phrase enrolled with [enrollUDT.java](https://doc.sensory.com/tnl/7.8/api/sample/java/enrollUDT.md#enrolludtjava). 
## Instructions Enroll a custom wake word as outlined in [enrollUDT.java](https://doc.sensory.com/tnl/7.8/api/sample/java/enrollUDT.md#enrolludt-instructions). Open a terminal window and enter the commands after the `%` prompt below (on Windows, replace `./gradlew` with `gradlew.bat`). Speak when prompted. Stop the process with `^C` when you're done. ```console % cd ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/java/enroll-udt/ % ./gradlew -q --console=plain eval Say your enrolled phrase. #00 "custom-phrase", score = 0.700 [2505 ms, 3480 ms] custom-phrase #01 "custom-phrase", score = 0.636 [6660 ms, 7425 ms] custom-phrase #02 "custom-phrase", score = 0.668 [11475 ms, 12525 ms] custom-phrase Recording: 14.50 ^C ``` ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/java/enroll-udt/src/main/java/com/sensory/speech/snsr/demo/enroll-udt/evalUDT.java_ **evalUDT.java:** ```java /* Sensory Confidential * Copyright (C)2016-2026 Sensory, Inc. https://sensory.com/ * * Command-line phrase spotter. *------------------------------------------------------------------------------ */ import java.io.Console; import java.io.File; import java.io.IOException; import com.sensory.speech.snsr.Snsr; import com.sensory.speech.snsr.SnsrRC; import com.sensory.speech.snsr.SnsrSession; import com.sensory.speech.snsr.SnsrStream; import enroll.BuildConfig; public class evalUDT { public static void main(String argv[]) { final int TIMEOUT = 60; final int SAMPLE_RATE = 16000; // Check whether the enrolled spotter model exists if (!(new File(BuildConfig.ENROLLED_MODEL).exists())) { System.out.println("Enrollment model file " + BuildConfig.ENROLLED_MODEL); System.out.println("was not found. 
Please enroll a phrase by running: " +
                         "./gradlew -q enroll");
      System.exit(1);
    }

    // Holder for the number of spots encountered so far
    final int[] spotCount = new int[1];

    // Spot from live audio
    SnsrStream audio = SnsrStream.fromAudioDevice();

    // Primary TrulyHandsfree session handle
    SnsrSession s = new SnsrSession();
    try {
      s.load(BuildConfig.ENROLLED_MODEL);
    } catch (IOException e) {
      e.printStackTrace();
      System.exit(3);
    }

    s.require(Snsr.TASK_TYPE, Snsr.PHRASESPOT)
      // .setDouble(Snsr.SV_THRESHOLD, 0.1) // test - override default
      .setStream(Snsr.SOURCE_AUDIO_PCM, audio)
      // Show the duration of processed audio,
      // and stop after TIMEOUT seconds
      .setHandler(Snsr.SAMPLES_EVENT, (ses, key) -> {
        double count = ses.getDouble(Snsr.RES_SAMPLES);
        System.out.print(String.format("\rRecording: %6.2f s",
                                       count / SAMPLE_RATE));
        if (count < SAMPLE_RATE * TIMEOUT) return SnsrRC.OK;
        return SnsrRC.TIMED_OUT;
      })
      // Phrase spot event. Show speaker verification score and alignments.
      .setHandler(Snsr.RESULT_EVENT, (ses, key) -> {
        System.out.println(String.format("\r#%02d \"%s\", score = %.3f",
                                         spotCount[0]++,
                                         ses.getString("text"),
                                         ses.getDouble("sv-score")));
        // Replace Snsr.WORD_LIST with Snsr.PHONE_LIST to show phonemes
        ses.forEach(Snsr.WORD_LIST, (s2, key2) -> {
          System.out.println(String.format(" [%.0f ms, %.0f ms] %s",
                                           s2.getDouble(Snsr.RES_BEGIN_MS),
                                           s2.getDouble(Snsr.RES_END_MS),
                                           s2.getString(Snsr.RES_TEXT)));
          return SnsrRC.OK;
        });
        System.out.println("");
        return SnsrRC.OK;
      });

    // Show an SDK license expiration warning, if needed
    final String licenseWarning = s.getString(Snsr.LICENSE_WARNING);
    if (licenseWarning != null) System.out.println(licenseWarning);

    System.out.println("Say your enrolled phrase.");
    try {
      s.run();
      s.release();
      audio.release();
    } catch (IOException e) {
      e.printStackTrace();
    }
    System.out.println("\nDone.");
  }
}
```

*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[UDT]: User-Defined Trigger: enrolled wake words and command sets

---
source_path: "api/sample/java/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/java/"
---

# Java examples

The Java sample programs and code snippets are available in _sample/java/_ in the TrulyNatural installation directory. See _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/java/_.

New to the Session API? Start with [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program) (`FirstSpot.java`), then explore the samples below.

## Examples

- [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudiojava) - Runs a wake word followed by a VAD, and saves the captured audio to file.
- [enrollUDT.java](https://doc.sensory.com/tnl/7.8/api/sample/java/enrollUDT.md#enrolludtjava) - Enrolls a user-defined wake word.
- [evalUDT.java](https://doc.sensory.com/tnl/7.8/api/sample/java/evalUDT.md#evaludtjava) - Runs the wake word enrolled by [enrollUDT.java](https://doc.sensory.com/tnl/7.8/api/sample/java/enrollUDT.md#enrolludtjava).
- [SnsrEnrollmentTest.java](https://doc.sensory.com/tnl/7.8/api/sample/java/SnsrEnrollmentTest.md#snsrenrollmenttestjava) - Unit tests for UDT enrollment and evaluation.
- [SnsrStreamAudioDeviceGeneric.java](https://doc.sensory.com/tnl/7.8/api/sample/java/SnsrStreamAudioDeviceGeneric.md#snsrstreamaudiodevicegenericjava) - Source for the [fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) implementation for [Java Audio][].
[Java Audio]: https://docs.oracle.com/javase/tutorial/sound/capturing.html "Audio capturing in Java" *[API]: Application Programming Interface *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[UDT]: User-Defined Trigger: enrolled wake words and command sets *[VAD]: Voice Activity Detector --- source_path: "api/sample/java/segmentSpottedAudio.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio/" --- # segmentSpottedAudio.java This example runs a phrase spotter followed by a VAD. It saves the VAD-segmented audio to file. ## Instructions To run this example, open a terminal window and enter the commands after the `%` prompt below (on Windows, replace `./gradlew` with `gradlew.bat`). Speak when prompted. ```console % cd ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/java/enroll-udt/ % ./gradlew -q --console=plain segment Say: "Hello Blue Genie will it rain in Portland tomorrow?" Found "hello blue genie"... listening VAD start detected. VAD endpoint ^end Found speech from 6990.000 ms to 8505.000 ms Wrote recording to "vad-audio.wav" ``` _vad-audio.wav_ captured above includes only the speech following the "Hello Blue Genie" wake word. You can change this to also include the wake word audio by setting the `include-spot` property: ```console % ./gradlew -q --console=plain segment -Pinclude-spot Say: "Hello Blue Genie will it rain in Portland tomorrow?" Found "hello blue genie"... listening VAD start detected. VAD endpoint ^end Found speech from 1620.000 ms to 3960.000 ms Wrote recording to "vad-audio.wav" ``` ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/java/enroll-udt/src/main/java/com/sensory/speech/snsr/demo/enroll-udt/segmentSpottedAudio.java_ **segmentSpottedAudio.java:** ```java /* Sensory Confidential * Copyright (C)2017-2026 Sensory, Inc. 
https://sensory.com/ * * Command-line phrase spotter, runs trailing audio through VAD * and saves this speech-detected audio to file. *------------------------------------------------------------------------------ */ import java.io.Console; import java.io.File; import java.io.IOException; import com.sensory.speech.snsr.Snsr; import com.sensory.speech.snsr.SnsrRC; import com.sensory.speech.snsr.SnsrSession; import com.sensory.speech.snsr.SnsrStream; import enroll.BuildConfig; public class segmentSpottedAudio { public static void main(String argv[]) { final String VAD_AUDIO_FILE = "vad-audio.wav"; final Boolean INCLUDE_SPOT = (argv.length == 1); // Spot from live audio SnsrStream audio = SnsrStream.fromAudioDevice(); // Primary TrulyHandsfree session handle SnsrSession s = new SnsrSession(); try { // Load and validate the spot-vad template model. s.load(BuildConfig.VAD_TEMPLATE); s.require(Snsr.TASK_TYPE, Snsr.PHRASESPOT_VAD); // Fill in slot #0 with a phrase spotter. s.setStream(Snsr.SLOT_0, SnsrStream.fromFileName(BuildConfig.HBG_MODEL, "r")); } catch (IOException e) { e.printStackTrace(); System.exit(3); } // Output file for VAD-selected audio. SnsrStream out = SnsrStream.fromAudioFile(VAD_AUDIO_FILE, "w"); // Configure session s.setStream(Snsr.SOURCE_AUDIO_PCM, audio) .setStream(Snsr.SINK_AUDIO_PCM, out) .setInt(Snsr.INCLUDE_LEADING_SILENCE, INCLUDE_SPOT ? 1 : 0) .setInt(Snsr.BACKOFF, 0) // reduce VAD audio margins to the minimum .setInt(Snsr.HOLD_OVER, 0) // Phrase spot event. .setHandler(Snsr.RESULT_EVENT, (session, key) -> { System.out.println(String.format("Found \"%s\"... 
listening",
                                         session.getString("text")));
        return SnsrRC.OK;
      });

    // VAD endpoint callback
    SnsrSession.Listener endpoint = (ses, key) -> {
      double from = ses.getDouble(Snsr.RES_BEGIN_MS);
      double to = ses.getDouble(Snsr.RES_END_MS);
      final String msg =
        String.format("Found speech from %.3f ms to %.3f ms", from, to);
      System.out.println("VAD endpoint " + key + "\n" + msg);
      // Stop after one VAD endpoint detection.
      return SnsrRC.STOP;
    };

    // Wire up handlers for the VAD events.
    s.setHandler(Snsr.END_EVENT, endpoint)
      .setHandler(Snsr.LIMIT_EVENT, endpoint)
      .setHandler(Snsr.BEGIN_EVENT, (session, key) -> {
        System.out.println("VAD start detected.");
        return SnsrRC.OK;
      })
      .setHandler(Snsr.SILENCE_EVENT, (session, key) -> {
        System.out.println("VAD endpoint " + key + "\n" +
                           "Listening for \"Hello Blue Genie\".");
        return SnsrRC.OK;
      });

    // Show an SDK license expiration warning, if needed
    final String licenseWarning = s.getString(Snsr.LICENSE_WARNING);
    if (licenseWarning != null) System.out.println(licenseWarning);

    System.out.println("Say: \"Hello Blue Genie will it rain " +
                       "in Portland tomorrow?\"");
    try {
      s.run();
    } catch (IOException e) {
      e.printStackTrace();
      System.exit(4);
    }
    s.release();
    out.release();
    System.out.println("Wrote recording to \"" + VAD_AUDIO_FILE + "\"");
  }
}
```

*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[UDT]: User-Defined Trigger: enrolled wake words and command sets
*[VAD]: Voice Activity Detector

---
source_path: "api/sample/python/custom_stream.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/python/custom_stream/"
---

# custom_stream.py

This example demonstrates [fromProvider](https://doc.sensory.com/tnl/7.8/api/io.md#fromprovider) from Python. It reads a WAV file into memory, strips the RIFF header, and serves raw PCM bytes through a Python `read` callback. The SDK pulls from that callback on its processing thread.
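The read-callback pattern described above can be sketched without the SDK. This is a minimal illustration, not the sample's code: `pcm_from_wav`, `make_reader`, and `make_test_wav` are hypothetical helper names. It also assumes you would rather parse the WAV container with Python's standard `wave` module than strip a fixed-size header, since WAV files that carry extra chunks have headers longer than the canonical 44 bytes.

```python
import io
import wave


def pcm_from_wav(wav_bytes: bytes) -> bytes:
    """Return the raw PCM frames stored in an in-memory WAV file."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.readframes(w.getnframes())


def make_reader(pcm: bytes):
    """Return a read(buffer) callback that serves ``pcm`` chunk by chunk."""
    offset = 0

    def read(buffer: bytearray) -> int:
        nonlocal offset
        # Copy as much as fits into the caller's buffer.
        n = min(len(buffer), len(pcm) - offset)
        buffer[:n] = pcm[offset:offset + n]
        offset += n
        # Returning fewer bytes than len(buffer) marks the end of the data.
        return n

    return read


def make_test_wav(pcm: bytes, rate: int = 16000) -> bytes:
    """Build a mono 16-bit WAV file in memory (demo data only)."""
    out = io.BytesIO()
    with wave.open(out, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(pcm)
    return out.getvalue()


if __name__ == "__main__":
    wav = make_test_wav(bytes(range(10)))
    read = make_reader(pcm_from_wav(wav))
    buf = bytearray(4)
    while (n := read(buf)) > 0:
        print(n, bytes(buf[:n]))
```

The callable returned by `make_reader` has the same shape as the `read` callback that the sample below passes to `snsr.Stream.from_provider(read=read)`.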
## Instructions

1. Set up the sample project environment:

    ```console
    cd ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/python
    uv venv
    uv sync
    ```

2. Run the sample:

    ```console
    uv run src/custom_stream.py
    ```

    The sample prints the phrase-spotter result received from the custom stream.

## Code

Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/python/src/custom_stream.py_

**custom_stream.py:**

```python
"""Custom audio stream for the TrulyNatural SDK Python binding.

Demonstrates ``Stream.from_provider``: build a ``snsr.Stream`` from Python
callbacks and hand it to a session via ``Session.set_stream``. This is the
right shape when the audio does not live in a regular file -- for example a
network socket, an in-memory ring buffer, or a hardware abstraction that does
not look like a microphone.

Here we keep the example small by reading the same WAV file as
``hello_world.py`` into memory once, stripping the 44-byte RIFF WAVE header,
and then serving the raw PCM through a Python ``read`` callback. The SDK pulls
one chunk at a time on its processing thread; the callback fills the buffer
and returns the number of bytes written, returning a value smaller than the
buffer to signal end-of-stream.

Modelled on ``tests/test_snsr.py::test_Stream_from_provider_read``.
Usage::

    uv run src/custom_stream.py [--sdk-root PATH]
"""

from __future__ import annotations

import argparse
import sys
from pathlib import Path

import snsr

MODEL = "spot-voicegenie-enUS-6.5.1-m.snsr"
AUDIO = "voice-genie-set-cruise-control.wav"
RIFF_HEADER_BYTES = 44


def default_sdk_root() -> Path:
    return Path(__file__).resolve().parents[3]


def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description=__doc__.splitlines()[0])
    parser.add_argument(
        "--sdk-root",
        type=Path,
        default=default_sdk_root(),
        help="TrulyNatural SDK install root (default: auto-detect)",
    )
    return parser.parse_args(argv)


def make_pcm_provider(pcm: bytes) -> snsr.Stream:
    """Return a snsr.Stream that yields ``pcm`` over ``read`` callbacks."""
    offset = 0

    def read(buffer: bytearray) -> int:
        nonlocal offset
        n = min(len(buffer), len(pcm) - offset)
        buffer[:n] = pcm[offset : offset + n]
        offset += n
        return n

    return snsr.Stream.from_provider(read=read)


def run_custom_stream(model_path: Path, audio_path: Path) -> int:
    """Run the spotter on ``audio_path`` via a from_provider Stream."""
    pcm = audio_path.read_bytes()[RIFF_HEADER_BYTES:]
    count = 0

    def on_result(s: snsr.Session, _key: str) -> snsr.RC | None:
        nonlocal count
        count += 1
        text = s.get_string(snsr.RES_TEXT)
        begin_ms = s.get_int(snsr.RES_BEGIN_MS)
        end_ms = s.get_double(snsr.RES_END_MS)
        score = s.get_double(snsr.RES_SCORE)
        print(f" spotted {text!r}: {begin_ms:>5} ms - {end_ms:>7.1f} ms (score {score:.4f})")

    print(f"snsr {snsr.VERSION}")
    print(f" model: {model_path}")
    print(f" audio: {audio_path} ({len(pcm)} bytes of PCM via from_provider)")
    print()

    with snsr.Session(str(model_path)) as s:
        s.require(snsr.TASK_TYPE, snsr.PHRASESPOT)
        s.set_handler(snsr.RESULT_EVENT, on_result)
        with make_pcm_provider(pcm) as audio:
            s.set_stream(snsr.SOURCE_AUDIO_PCM, audio)
            s.run()

    print()
    print(f"done: {count} result event(s)")
    return count


def main(argv: list[str] | None = None) -> int:
    args = parse_args(argv)
    sdk_root: Path = args.sdk_root.resolve()
    model_path = sdk_root / "model" / MODEL
    audio_path = sdk_root / "data" / "audio" / AUDIO

    for label, path in (("model", model_path), ("audio", audio_path)):
        if not path.is_file():
            print(f"error: {label} not found: {path}", file=sys.stderr)
            print(
                f"hint: pass --sdk-root pointing at a TrulyNatural SDK install",
                file=sys.stderr,
            )
            return 2

    run_custom_stream(model_path, audio_path)
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "api/sample/python/hello_world.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/python/hello_world/"
---

# hello_world.py

New to the Python Session API? Start with the **Python** tab in [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program), then return here for the complete sample.

This example loads a phrase-spotter model, runs it across a sample WAV file in pull mode, and prints each spotted phrase as it arrives.

## Instructions

1. Set up the sample project environment:

    ```console
    cd ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/python
    uv venv
    uv sync
    ```

2. Run the sample:

    ```console
    uv run src/hello_world.py
    ```

    The sample prints the SDK version, model and audio file paths, then one line per [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) event.

## Code

Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/python/src/hello_world.py_

**hello_world.py:**

```python
"""Hello world for the TrulyNatural SDK Python binding.

Loads a phrase-spotter model and runs it across a sample WAV file,
printing each spotted phrase as it arrives.
Modelled on ``tests/test_snsr.py::test_Session_run_spotter`` from the ``snsr``
binding test suite, trimmed to the minimum needed to demonstrate the public
API.

Usage::

    uv run src/hello_world.py [--sdk-root PATH]

The default ``--sdk-root`` is the SDK install directory containing this
sample (``/sample/python/`` -> ````).
"""

from __future__ import annotations

import argparse
import sys
from pathlib import Path

import snsr

MODEL = "spot-voicegenie-enUS-6.5.1-m.snsr"
AUDIO = "voice-genie-set-cruise-control.wav"


def default_sdk_root() -> Path:
    # /sample/python/src/hello_world.py
    # parents[0] = src/
    # parents[1] = python/
    # parents[2] = sample/
    # parents[3] =
    return Path(__file__).resolve().parents[3]


def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description=__doc__.splitlines()[0])
    parser.add_argument(
        "--sdk-root",
        type=Path,
        default=default_sdk_root(),
        help="TrulyNatural SDK install root (default: auto-detect)",
    )
    return parser.parse_args(argv)


def run_spotter(model_path: Path, audio_path: Path) -> int:
    """Run the spotter on ``audio_path`` and print one line per result.

    Returns the number of result events received.
    """
    count = 0

    def on_result(s: snsr.Session, _key: str) -> snsr.RC | None:
        nonlocal count
        count += 1
        text = s.get_string(snsr.RES_TEXT)
        begin_ms = s.get_int(snsr.RES_BEGIN_MS)
        end_ms = s.get_double(snsr.RES_END_MS)
        score = s.get_double(snsr.RES_SCORE)
        print(f" spotted {text!r}: {begin_ms:>5} ms - {end_ms:>7.1f} ms (score {score:.4f})")

    print(f"snsr {snsr.VERSION}")
    print(f" model: {model_path}")
    print(f" audio: {audio_path}")
    print()

    with snsr.Session(str(model_path)) as s:
        s.require(snsr.TASK_TYPE, snsr.PHRASESPOT)
        s.set_handler(snsr.RESULT_EVENT, on_result)
        s.set_stream(snsr.SOURCE_AUDIO_PCM, str(audio_path))
        s.run()

    print()
    print(f"done: {count} result event(s)")
    return count


def main(argv: list[str] | None = None) -> int:
    args = parse_args(argv)
    sdk_root: Path = args.sdk_root.resolve()
    model_path = sdk_root / "model" / MODEL
    audio_path = sdk_root / "data" / "audio" / AUDIO

    for label, path in (("model", model_path), ("audio", audio_path)):
        if not path.is_file():
            print(f"error: {label} not found: {path}", file=sys.stderr)
            print(f"hint: pass --sdk-root pointing at a TrulyNatural SDK install", file=sys.stderr)
            return 2

    run_spotter(model_path, audio_path)
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "api/sample/python/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/python/"
---

# Python examples

The Python sample projects are available in _sample/python/_ in the TrulyNatural installation directory. See _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/python/_.

New to the Session API? Start with the **Python** tab in [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program), then explore the samples below.

## Run the samples

1. Create the virtual environment and install the SDK wheel from _$HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0/lib/python/_:

    ```console
    cd ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/python
    uv venv
    uv sync
    ```

2. Run the smallest phrase-spotting sample:

    ```console
    uv run src/hello_world.py
    ```

3. Run the sample acceptance tests:

    ```console
    uv run --group dev pytest
    ```

    [live_audio.py](https://doc.sensory.com/tnl/7.8/api/sample/python/live_audio.md#live_audiopy) is excluded from the default sweep because CI machines usually have no microphone. Set `SNSR_RUN_LIVE_AUDIO=1` to include a short live-capture run.

## Examples

- [hello_world.py](https://doc.sensory.com/tnl/7.8/api/sample/python/hello_world.md#hello_worldpy) - Loads a wake-word model, runs pull-mode inference from a WAV file, and prints each result event.
- [stt_push.py](https://doc.sensory.com/tnl/7.8/api/sample/python/stt_push.md#stt-push-py) - Runs push-mode Speech-to-Text by feeding WAV audio chunks with [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push). On SDK builds without STT, it exits cleanly with an "STT not supported" message.
- [custom_stream.py](https://doc.sensory.com/tnl/7.8/api/sample/python/custom_stream.md#custom_streampy) - Implements a custom input source with [fromProvider](https://doc.sensory.com/tnl/7.8/api/io.md#fromprovider) and a zero-copy Python `memoryview` callback.
- [live_audio.py](https://doc.sensory.com/tnl/7.8/api/sample/python/live_audio.md#live_audiopy) - Captures from the default microphone with [fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) and runs a wake-word spotter for a fixed duration.
- [live_enroll.py](https://doc.sensory.com/tnl/7.8/api/sample/python/live_enroll.md#live_enrollpy) - Runs interactive UDT wake-word enrollment from recordings or live microphone input.
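The push-mode samples listed above feed audio to a session in fixed-size chunks. The chunking and byte-budget arithmetic can be sketched on its own, without any SDK calls. This is an illustration under stated assumptions: `iter_chunks` and `bytes_for_duration` are hypothetical helper names, and the audio format is assumed to be 16 kHz, 16-bit mono PCM, matching the constants used in `live_audio.py`.

```python
from typing import Iterator

SAMPLES_PER_SECOND = 16_000  # assumed capture rate
BYTES_PER_SAMPLE = 2         # assumed 16-bit LPCM, mono


def bytes_for_duration(seconds: float) -> int:
    """Return the number of PCM bytes that cover ``seconds`` of audio."""
    return int(seconds * SAMPLES_PER_SECOND) * BYTES_PER_SAMPLE


def iter_chunks(pcm: bytes, chunk_bytes: int) -> Iterator[bytes]:
    """Yield consecutive slices of ``pcm``; the last one may be shorter."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


if __name__ == "__main__":
    # One second of silence, split into 480-byte chunks.
    silence = bytes(bytes_for_duration(1.0))
    chunks = list(iter_chunks(silence, 480))
    print(len(silence), len(chunks))
```

In a real program each chunk would be handed to the session, as `live_audio.py` does with `s.push(snsr.SOURCE_AUDIO_PCM, chunk)`.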
*[API]: Application Programming Interface *[SDK]: Software Development Kit *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[UDT]: User-Defined Trigger: enrolled wake words and command sets --- source_path: "api/sample/python/live_audio.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/python/live_audio/" --- # live_audio.py This example captures live audio from the host's default microphone with [fromAudioDevice](https://doc.sensory.com/tnl/7.8/api/io.md#fromaudiodevice) and runs a wake-word spotter for a fixed duration. It needs a real default capture device. The acceptance test only smoke-tests `--help` unless `SNSR_RUN_LIVE_AUDIO=1` is set. ## Instructions 1. Set up the sample project environment: ```console cd ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/python uv venv uv sync ``` 2. Run the sample and say "voice genie" during the capture window: ```console uv run src/live_audio.py --duration 10 ``` Increase `--duration` if you need more time before speaking. ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/python/src/live_audio.py_ **live_audio.py:** ```python """Live microphone capture for the TrulyNatural SDK Python binding. Opens the host's default audio input device via ``Stream.from_audio_device``, wires it to a phrase-spotter session, and prints results as they arrive in real time. Capture runs for ``--duration`` seconds (default 10). This sample exercises the SDK's host-audio backend (ALSA on Linux, Audio Queue Services on macOS, the Windows Multimedia Extensions wave API on Windows). It needs a real default capture device and is **not** part of the default acceptance test sweep because CI runners typically have no microphone; the acceptance test only verifies that the script imports and parses arguments cleanly. 
To run it for real, plug in a microphone and execute:: uv run src/live_audio.py [--sdk-root PATH] [--duration SECS] then say "voice genie" within the capture window. The opt-in env toggle ``SNSR_RUN_LIVE_AUDIO=1`` flips the acceptance test from help-only to a full live-audio run. """ from __future__ import annotations import argparse import sys from pathlib import Path import snsr MODEL = "spot-voicegenie-enUS-6.5.1-m.snsr" DEFAULT_DURATION_S = 10.0 SAMPLES_PER_SECOND = 16_000 # snsr default capture format BYTES_PER_SAMPLE = 2 # 16-bit LPCM CHUNK_BYTES = 480 def default_sdk_root() -> Path: return Path(__file__).resolve().parents[3] def parse_args(argv: list[str] | None = None) -> argparse.Namespace: parser = argparse.ArgumentParser(description=__doc__.splitlines()[0]) parser.add_argument( "--sdk-root", type=Path, default=default_sdk_root(), help="TrulyNatural SDK install root (default: auto-detect)", ) parser.add_argument( "--duration", type=float, default=DEFAULT_DURATION_S, help=f"capture duration in seconds (default: {DEFAULT_DURATION_S})", ) return parser.parse_args(argv) def run_live_audio(model_path: Path, duration_s: float) -> int: """Run the spotter on the default capture device for ``duration_s``.""" count = 0 def on_result(s: snsr.Session, _key: str) -> snsr.RC | None: nonlocal count count += 1 text = s.get_string(snsr.RES_TEXT) score = s.get_double(snsr.RES_SCORE) print(f" spotted {text!r} (score {score:.4f})") print(f"snsr {snsr.VERSION}") print(f" model: {model_path}") print(f" capturing for {duration_s:.1f}s; say 'voice genie'...") print() bytes_to_capture = int(duration_s * SAMPLES_PER_SECOND) * BYTES_PER_SAMPLE captured = 0 with snsr.Session(str(model_path)) as s: s.require(snsr.TASK_TYPE, snsr.PHRASESPOT) s.set_handler(snsr.RESULT_EVENT, on_result) with snsr.Stream.from_audio_device() as mic: while captured < bytes_to_capture: chunk = mic.read(CHUNK_BYTES) if not chunk: break s.push(snsr.SOURCE_AUDIO_PCM, chunk) captured += len(chunk) s.stop() 
print() print(f"done: {count} result event(s) over {captured / (SAMPLES_PER_SECOND * BYTES_PER_SAMPLE):.1f}s of audio") return count def main(argv: list[str] | None = None) -> int: args = parse_args(argv) sdk_root: Path = args.sdk_root.resolve() model_path = sdk_root / "model" / MODEL if not model_path.is_file(): print(f"error: model not found: {model_path}", file=sys.stderr) print( f"hint: pass --sdk-root pointing at a TrulyNatural SDK install", file=sys.stderr, ) return 2 if args.duration <= 0: print(f"error: --duration must be positive, got {args.duration}", file=sys.stderr) return 2 run_live_audio(model_path, args.duration) return 0 if __name__ == "__main__": raise SystemExit(main()) ``` *[API]: Application Programming Interface *[SDK]: Software Development Kit *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "api/sample/python/live_enroll.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/python/live_enroll/" --- # live_enroll.py This example runs User-Defined Trigger (UDT) wake-word enrollment from Python. By default it feeds four pre-recorded enrollment WAV files from _data/enrollments/_. Pass `--live` to capture enrollment audio from the default microphone instead. ## Instructions 1. Set up the sample project environment: ```console cd ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/python uv venv uv sync ``` 2. Run file-based enrollment with the default `armadillo-1` recordings: ```console uv run src/live_enroll.py ``` The sample writes _enrolled-sv.snsr_ unless you pass `--output`. 3. To enroll from live microphone input instead: ```console uv run src/live_enroll.py --live --user my-wake-word ``` ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/python/src/live_enroll.py_ **live_enroll.py:** ```python """Live wake-word enrollment for the TrulyNatural SDK Python binding. 
Loads a User-Defined Trigger (UDT) enrollment task and walks through interactive enrollment: assign users, capture enrollment utterances, run quality checks, and write an enrolled spotter model. This mirrors the ``live-enroll`` command-line tool and its C sample (``live-enroll.c``). By default the sample feeds pre-recorded enrollment WAV files from ``/data/enrollments/`` (the same ``armadillo-1`` set used by ``live-enroll`` in CI). Pass ``--live`` to capture from the host's default microphone instead. Usage:: uv run src/live_enroll.py [--sdk-root PATH] [options] uv run src/live_enroll.py --live --user my-wake-word """ from __future__ import annotations import argparse import sys from dataclasses import dataclass from pathlib import Path import snsr TASK_MODEL = "udt-universal-3.67.1.0.snsr" ENROLL_TASK_VERSION = "~0.8.0 || 1.0.0" DEFAULT_OUTPUT = "enrolled-sv.snsr" DEFAULT_USER = "armadillo-1" DEFAULT_ACCURACY = 0.1 SAMPLES_PER_SECOND = 16_000 # Same four utterances as devkit-live-enroll-1.6 in live-enroll.test. DEFAULT_AUDIO = tuple(f"armadillo-1-{i}.wav" for i in range(4)) def default_sdk_root() -> Path: return Path(__file__).resolve().parents[3] def enrollment_audio_dir(sdk_root: Path) -> Path: return sdk_root / "data" / "enrollments" @dataclass class EnrollState: """Mutable state shared by enrollment event handlers.""" users: list[str] model_path: Path enroll_path: Path | None prefix: str | None verbosity: int audio: snsr.Stream | None = None current_user: int = 0 phrase: str = "the enrollment phrase" fail_count: int = 0 def parse_args(argv: list[str] | None = None) -> argparse.Namespace: parser = argparse.ArgumentParser( description=__doc__.splitlines()[0], formatter_class=argparse.RawDescriptionHelpFormatter, epilog=( "Without --live, enrollment audio defaults to four WAV files under\n" f" /data/enrollments/{DEFAULT_AUDIO[0]} ... {DEFAULT_AUDIO[-1]}\n" "Pass additional paths after the options to use your own recordings." 
), ) parser.add_argument( "--sdk-root", type=Path, default=default_sdk_root(), help="TrulyNatural SDK install root (default: auto-detect)", ) parser.add_argument( "--task", type=Path, help=f"enrollment task model (default: /model/{TASK_MODEL})", ) parser.add_argument( "--output", "-o", type=Path, default=Path(DEFAULT_OUTPUT), help=f"enrolled spotter output path (default: {DEFAULT_OUTPUT})", ) parser.add_argument( "--enroll", "-e", type=Path, help="optional enrollment-context output path", ) parser.add_argument( "--prefix", "-p", type=str, help="save each enrollment capture as --{pass,fail}-.wav", ) parser.add_argument( "--user", action="append", default=[], metavar="NAME", help="user to enroll (repeat for multiple; default: armadillo-1)", ) parser.add_argument( "--accuracy", type=float, default=DEFAULT_ACCURACY, help=f"enrollment accuracy setting (default: {DEFAULT_ACCURACY})", ) parser.add_argument( "--live", action="store_true", help="capture from the default microphone instead of WAV files", ) parser.add_argument( "-v", "--verbose", action="count", default=0, help="increase status output (repeat up to three times)", ) parser.add_argument( "audio", nargs="*", type=Path, help="enrollment WAV file(s); ignored when --live is set", ) return parser.parse_args(argv) def resolve_paths(args: argparse.Namespace) -> tuple[Path, Path, list[Path]]: sdk_root = args.sdk_root.resolve() task_path = (args.task or sdk_root / "model" / TASK_MODEL).resolve() audio_paths: list[Path] = [] if not args.live: if args.audio: audio_paths = [p.resolve() for p in args.audio] else: enroll_dir = enrollment_audio_dir(sdk_root) audio_paths = [enroll_dir / name for name in DEFAULT_AUDIO] return sdk_root, task_path, audio_paths def build_audio_stream(paths: list[Path]) -> snsr.Stream: """Concatenate enrollment WAVs into one PCM stream (``live-enroll`` file mode).""" chain = snsr.Stream.from_string("") for path in paths: wav = snsr.Stream.from_audio_file(str(path)) chain = 
snsr.Stream.from_streams(chain, wav) return chain def save_enrollment_audio( s: snsr.Session, state: EnrollState, tag: str, enroll_id: int ) -> None: if not state.prefix: return user = s.get_string(snsr.USER) if isinstance(user, bytes): user = user.decode() dash = "-" if state.prefix else "" out_name = f"{state.prefix}{dash}{user}-{tag}-{enroll_id}.wav" enrollment = s.get_stream(snsr.AUDIO_STREAM) if enrollment.rc != snsr.RC.OK: return with snsr.Stream.from_audio_file(out_name, "w") as out: out.copy(enrollment, 2**63 - 1) if out.rc not in (snsr.RC.OK, snsr.RC.EOF): raise snsr.Error(out.rc, message=out.error_detail) if state.verbosity >= 1: print(f"Saved enrollment audio to {out_name}") def print_reason(s: snsr.Session, state: EnrollState) -> None: if s.get_int(snsr.RES_REASON_PASS): return reason = s.get_string(snsr.RES_REASON) guidance = s.get_string(snsr.RES_GUIDANCE) if isinstance(reason, bytes): reason = reason.decode() if isinstance(guidance, bytes): guidance = guidance.decode() print("This enrollment recording is not usable.", file=sys.stderr) print(f" Reason: {reason}", file=sys.stderr) if state.verbosity >= 2: value = s.get_double(snsr.RES_REASON_VALUE) threshold = s.get_double(snsr.RES_REASON_THRESHOLD) print(f" ({value:.2f}, threshold is {threshold:.2f})", file=sys.stderr) print(f" Fix: {guidance}", file=sys.stderr) def install_handlers(s: snsr.Session, state: EnrollState) -> None: def next_event(sess: snsr.Session, _key: str) -> snsr.RC | None: if state.current_user >= len(state.users): return snsr.RC.OK user = state.users[state.current_user] state.current_user += 1 sess.set_string(snsr.USER, user) return snsr.RC.OK def pass_event(sess: snsr.Session, _key: str) -> snsr.RC | None: if state.verbosity >= 1: print("Preliminary enrollment checks passed.") if state.prefix: enroll_id = sess.get_int(snsr.RES_ENROLLMENT_ID) save_enrollment_audio(sess, state, "pass", enroll_id) return snsr.RC.OK def fail_event(sess: snsr.Session, _key: str) -> snsr.RC | None: 
print_reason(sess, state) if state.prefix: save_enrollment_audio(sess, state, "fail", state.fail_count) state.fail_count += 1 return snsr.RC.OK def pause_event(_sess: snsr.Session, _key: str) -> snsr.RC | None: if state.audio is not None: state.audio.close() print() return snsr.RC.OK def resume_event(sess: snsr.Session, _key: str) -> snsr.RC | None: if state.audio is not None: state.audio.open() count = sess.get_int(snsr.RES_ENROLLMENT_COUNT) target = sess.get_int(snsr.ENROLLMENT_TARGET) user = sess.get_string(snsr.USER) if isinstance(user, bytes): user = user.decode() ctx = sess.get_int(snsr.ADD_CONTEXT) print(f'\nSay {state.phrase} ({count + 1}/{target}) for "{user}"', end="") if ctx: print( ',\n for example: " will it rain tomorrow?"', end="", ) print() return snsr.RC.OK def samples_event(sess: snsr.Session, _key: str) -> snsr.RC | None: seconds = sess.get_double(snsr.RES_SAMPLES) / SAMPLES_PER_SECOND print(f"Recording: {seconds:6.2f} s\r", end="", flush=True) return snsr.RC.OK def prog_event(sess: snsr.Session, _key: str) -> snsr.RC | None: if state.verbosity >= 1: progress = sess.get_double(snsr.RES_PERCENT_DONE) print(f"\rAdapting: {progress:3.0f}% complete.", end="", flush=True) if progress >= 100: print() return snsr.RC.OK def done_event(sess: snsr.Session, _key: str) -> snsr.RC | None: model = sess.get_stream(snsr.MODEL_STREAM) written = model.get_meta(snsr.StreamMeta.BYTES_WRITTEN) state.model_path.parent.mkdir(parents=True, exist_ok=True) with snsr.Stream.from_filename(str(state.model_path), "w") as out: out.copy(model, written) if out.rc != snsr.RC.OK: raise snsr.Error(out.rc, message=out.error_detail) if state.verbosity >= 1: print(f'Enrolled model saved to "{state.model_path}"') return snsr.RC.STOP def enrolled_event(sess: snsr.Session, _key: str) -> snsr.RC | None: if state.enroll_path is None: return snsr.RC.OK state.enroll_path.parent.mkdir(parents=True, exist_ok=True) sess.save(snsr.DataFormat.RUNTIME, str(state.enroll_path)) if state.verbosity >= 
1: print(f'Enrollment context saved to "{state.enroll_path}"') return snsr.RC.OK def capture_phrase(sess: snsr.Session, _key: str) -> snsr.RC | None: vocab = sess.get_string(snsr.RES_TEXT) if isinstance(vocab, bytes): vocab = vocab.decode() state.phrase = vocab return snsr.RC.OK s.set_handler(snsr.NEXT_EVENT, next_event) s.set_handler(snsr.DONE_EVENT, done_event) s.set_handler(snsr.FAIL_EVENT, fail_event) s.set_handler(snsr.PASS_EVENT, pass_event) s.set_handler(snsr.PROG_EVENT, prog_event) s.set_handler(snsr.PAUSE_EVENT, pause_event) s.set_handler(snsr.RESUME_EVENT, resume_event) s.set_handler(snsr.SAMPLES_EVENT, samples_event) if state.enroll_path is not None: s.set_handler(snsr.ENROLLED_EVENT, enrolled_event) try: s.for_each(snsr.VOCAB_LIST, capture_phrase) except snsr.Error: pass def run_enrollment( task_path: Path, audio_paths: list[Path], *, live: bool, users: list[str], output: Path, enroll: Path | None, prefix: str | None, accuracy: float, verbosity: int, ) -> None: state = EnrollState( users=users, model_path=output.resolve(), enroll_path=enroll.resolve() if enroll else None, prefix=prefix, verbosity=verbosity, ) print(f"snsr {snsr.VERSION}") print(f" task: {task_path}") if live: print(" audio: default capture device") else: print(f" audio: {len(audio_paths)} file(s)") print(f" users: {', '.join(users)}") print(f" output: {state.model_path}") if state.enroll_path: print(f" enroll: {state.enroll_path}") print() with snsr.Session(str(task_path)) as s: s.require(snsr.TASK_TYPE, snsr.ENROLL) s.require(snsr.TASK_VERSION, ENROLL_TASK_VERSION) s.set_int(snsr.INTERACTIVE_MODE, 1) s.set_double(snsr.ACCURACY, accuracy) if prefix: s.set_int(snsr.SAVE_ENROLLMENT_AUDIO, 1) install_handlers(s, state) if live: state.audio = snsr.Stream.from_audio_device() else: state.audio = build_audio_stream(audio_paths) with state.audio: s.set_stream(snsr.SOURCE_AUDIO_PCM, state.audio) rc = s.run() if rc not in (snsr.RC.OK, snsr.RC.STOP): raise snsr.Error(rc, 
message=snsr.Session.rc_message(rc)) def main(argv: list[str] | None = None) -> int: args = parse_args(argv) sdk_root, task_path, audio_paths = resolve_paths(args) users = args.user or [DEFAULT_USER] if not task_path.is_file(): print(f"error: enrollment task not found: {task_path}", file=sys.stderr) print( f"hint: pass --sdk-root or --task pointing at a TrulyNatural SDK install", file=sys.stderr, ) return 2 if not args.live: missing = [p for p in audio_paths if not p.is_file()] if missing: print("error: enrollment audio not found:", file=sys.stderr) for path in missing: print(f" {path}", file=sys.stderr) print( "hint: pass WAV paths, use --live, or install SDK data/enrollments/", file=sys.stderr, ) return 2 try: run_enrollment( task_path, audio_paths, live=args.live, users=users, output=args.output, enroll=args.enroll, prefix=args.prefix, accuracy=args.accuracy, verbosity=args.verbose, ) except snsr.Error as e: print(f"error: {e.message}", file=sys.stderr) return 1 return 0 if __name__ == "__main__": raise SystemExit(main()) ``` *[API]: Application Programming Interface *[SDK]: Software Development Kit *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[UDT]: User-Defined Trigger: enrolled wake words and command sets --- source_path: "api/sample/python/stt_push.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/sample/python/stt_push/" --- # stt_push.py _(STT only)_ This example shows push-mode Speech-to-Text. The application owns the audio source, reads a WAV file in small chunks, and feeds each chunk to the recognizer with [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push). STT support is a TrulyNatural-only feature. On builds that do not include STT, the sample prints "STT not supported" and exits successfully. ## Instructions 1. Set up the sample project environment: ```console cd ~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/python uv venv uv sync ``` 2. 
Run the sample: ```console uv run src/stt_push.py ``` On STT-capable builds, the sample prints the final recognition result. ## Code Available in this TrulyNatural SDK installation at _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/sample/python/src/stt_push.py_ **stt_push.py:** ```python """Push-mode Speech-To-Text for the TrulyNatural SDK Python binding. Loads the automotive STT model and feeds a WAV file into the session in 480-byte chunks via ``Session.push``, printing each ``RESULT_EVENT`` as it arrives. This is the streaming counterpart to ``hello_world.py``'s pull-mode ``set_stream`` + ``run`` loop: the application owns the audio source and decides when to feed samples in, which is the right shape for live audio, network streams, or any other producer that does not look like a file. Modelled on ``tests/test_snsr.py::test_Session_push_spotter`` and ``::test_Session_run_stt_reset`` from the ``snsr`` binding test suite. STT support is a TrulyNatural-only feature. On builds that do not include it (notably TrulyHandsfree), the script prints a clear "STT not supported" line and exits 0 so it composes cleanly with the SDK's acceptance test. Usage:: uv run src/stt_push.py [--sdk-root PATH] """ from __future__ import annotations import argparse import sys from pathlib import Path import snsr MODEL = "stt-enUS-automotive-medium-2.3.15-pnc.snsr" AUDIO = "voice-genie-set-cruise-control.wav" CHUNK_BYTES = 480 def default_sdk_root() -> Path: return Path(__file__).resolve().parents[3] def parse_args(argv: list[str] | None = None) -> argparse.Namespace: parser = argparse.ArgumentParser(description=__doc__.splitlines()[0]) parser.add_argument( "--sdk-root", type=Path, default=default_sdk_root(), help="TrulyNatural SDK install root (default: auto-detect)", ) return parser.parse_args(argv) def stt_supported() -> bool: """Return True if the loaded ``snsr`` build supports STT. 
``STT_SUPPORT`` is a session-level int setting; we open a transient session purely to query it and close it before loading the actual STT model. """ with snsr.Session() as s: return bool(s.get_int(snsr.STT_SUPPORT)) def run_stt_push(model_path: Path, audio_path: Path) -> int: """Push ``audio_path`` into a fresh STT session, return event count.""" count = 0 def on_result(s: snsr.Session, _key: str) -> snsr.RC | None: nonlocal count count += 1 text = s.get_string(snsr.RES_TEXT) print(f" result: {text!r}") print(f"snsr {snsr.VERSION}") print(f" model: {model_path}") print(f" audio: {audio_path}") print() with snsr.Session(str(model_path)) as s: s.set_handler(snsr.RESULT_EVENT, on_result) with snsr.Stream.from_audio_file(str(audio_path)) as audio: for chunk in iter(lambda: audio.read(CHUNK_BYTES), b""): s.push(snsr.SOURCE_AUDIO_PCM, chunk) s.stop() print() print(f"done: {count} result event(s)") return count def main(argv: list[str] | None = None) -> int: args = parse_args(argv) sdk_root: Path = args.sdk_root.resolve() if not stt_supported(): print( "STT not supported in this TrulyNatural build " "(snsr.STT_SUPPORT == 0); skipping." 
) return 0 model_path = sdk_root / "model" / MODEL audio_path = sdk_root / "data" / "audio" / AUDIO for label, path in (("model", model_path), ("audio", audio_path)): if not path.is_file(): print(f"error: {label} not found: {path}", file=sys.stderr) print( f"hint: pass --sdk-root pointing at a TrulyNatural SDK install", file=sys.stderr, ) return 2 run_stt_push(model_path, audio_path) return 0 if __name__ == "__main__": raise SystemExit(main()) ``` *[API]: Application Programming Interface *[SDK]: Software Development Kit *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "api/setting-keys/configuration.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration/" --- # Configuration Configuration settings are both readable and writable and are part of task models; they are saved to [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) by [dup](https://doc.sensory.com/tnl/7.8/api/inference.md#dup) and [save](https://doc.sensory.com/tnl/7.8/api/inference.md#save), and restored by [load](https://doc.sensory.com/tnl/7.8/api/inference.md#load). Use these to change model or fine-tune model behavior. Models have reasonable defaults, so most applications set no configuration keys at runtime — defaults are baked into the `.snsr` model file at training time. 
Most salient are [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) for wake words and command sets, [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) and [trailing-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#trailing-silence) for VAD templates, [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval) for LVCSR and STT, and [stt-profile](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#stt-profile) for STT models.

Common reasons applications *do* override configuration include enrollment (see [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user), [interactive](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#interactive)), [THF Micro][] export (see [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target)), and selecting an active template slot in multi-stage models (see [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot)).

Use the [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) `get` and `set` functions that match the type of the setting. Use [getInt](https://doc.sensory.com/tnl/7.8/api/inference.md#getters), for example, to read the _(int)_ value for [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point).

### Details: Page conventions

Settings are grouped by concern. Within each group they're listed alphabetically. A setting that serves more than one concern appears once under its primary group; secondary groups link to it.

The code tabs on each setting show one paste-able call site per language. Adapt the placeholder names (`s` for the [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session), plus `value`, `stream`, `on_event`, or `on_item` for the operand) to your code.
Each example assumes the language's standard prelude:

**C/C++**

```c
#include
```

**Java**

```java
import com.sensory.speech.snsr.Snsr;
import com.sensory.speech.snsr.SnsrSession;
```

**Python**

```python
import snsr
```

For the full Session lifecycle (`snsrNew`, `new SnsrSession()`, `snsr.Session(...)`) see [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program).

## Audio I/O

### audio-stream-size

- configuration
- int
- read-write

**C/C++**

```c
snsrSetInt(s, SNSR_AUDIO_STREAM_SIZE, value);
```

**Java**

```java
s.setInt(Snsr.AUDIO_STREAM_SIZE, value);
```

**Python**

```python
s.set_int(snsr.AUDIO_STREAM_SIZE, value)
```

Input audio buffer size.

The number of audio samples kept in a circular audio history buffer, accessible through [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream). Use this buffer to retrieve segmented audio using alignments ([begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample), [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms)) obtained in the [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result). Set to `0` to disable audio buffering.

**Also see these related items:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream)

### samples-per-second

- configuration
- int
- read-only

**C/C++**

```c
int value;
snsrGetInt(s, SNSR_SAMPLE_RATE, &value);
```

**Java**

```java
int value = s.getInt(Snsr.SAMPLE_RATE);
```

**Python**

```python
value = s.get_int(snsr.SAMPLE_RATE)
```

Model sample rate in Hz.
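Because `audio-stream-size` is expressed in samples while alignments are often reasoned about in milliseconds, it can help to convert between the two using the model sample rate. The sketch below is plain arithmetic, not part of the SDK: the helper name `samples_for_duration` is hypothetical, and the 16 kHz default is an assumption taken from the capture format used in the Python samples.

```python
import math


def samples_for_duration(duration_ms: float, sample_rate_hz: int = 16_000) -> int:
    """Return the number of samples needed to hold duration_ms of audio.

    Hypothetical helper, not an SDK function. The 16 kHz default is an
    assumption; read the real rate from the model where possible.
    """
    if duration_ms < 0:
        raise ValueError("duration_ms must be non-negative")
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    # Round up so the buffer never falls short of the requested duration.
    return math.ceil(duration_ms * sample_rate_hz / 1000)


# Possible use with the call sites shown in the code tabs above
# (s is an open Session):
#   rate = s.get_int(snsr.SAMPLE_RATE)
#   s.set_int(snsr.AUDIO_STREAM_SIZE, samples_for_duration(2000, rate))
```

Rounding up is a design choice here: a buffer that is one sample too long is harmless, while one that is too short could truncate the audio segment being retrieved.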
## VAD & endpointing

### backoff

- configuration
- int
- read-write

**C/C++**

```c
snsrSetInt(s, SNSR_BACKOFF, value);
```

**Java**

```java
s.setInt(Snsr.BACKOFF, value);
```

**Python**

```python
s.set_int(snsr.BACKOFF, value)
```

Start point back-off in ms.

Audio margin added before the start point found by a VAD.

**Also see these related items:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample)

### hold-over

- configuration
- int
- read-write

**C/C++**

```c
snsrSetInt(s, SNSR_HOLD_OVER, value);
```

**Java**

```java
s.setInt(Snsr.HOLD_OVER, value);
```

**Python**

```python
s.set_int(snsr.HOLD_OVER, value)
```

Endpoint hold-over.

Audio margin added after the endpoint found by a VAD. This is the amount of trailing silence to include in the segmentation.

**Also see these related items:** [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample)

### include-leading-silence

- configuration
- int
- read-write

**C/C++**

```c
snsrSetInt(s, SNSR_INCLUDE_LEADING_SILENCE, value);
```

**Java**

```java
s.setInt(Snsr.INCLUDE_LEADING_SILENCE, value);
```

**Python**

```python
s.set_int(snsr.INCLUDE_LEADING_SILENCE, value)
```

Include leading silence in VAD output.

Set to `1` to include all audio up to the endpoint in the [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out) output stream. Set to `0` to return to the default behavior, which discards leading silence.
If this setting is used with a spot-VAD template such as [tpl-spot-vad](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad.md#tpl-spot-vad-type), [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type), or [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type), the leading silence includes the trigger phrase.

**Also see these related items:** [include-wake-word-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-wake-word-audio), [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out), [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through)

### include-wake-word-audio

- configuration
- int
- read-write

**C/C++**

```c
snsrSetInt(s, SNSR_INCLUDE_WAKE_WORD_AUDIO, value);
```

**Java**

```java
s.setInt(Snsr.INCLUDE_WAKE_WORD_AUDIO, value);
```

**Python**

```python
s.set_int(snsr.INCLUDE_WAKE_WORD_AUDIO, value)
```

Include the wake word audio in VAD output.

When set to `1`, VAD templates [tpl-spot-vad](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad.md#tpl-spot-vad-type), [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type), and [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type) include the wake word in the audio output. Set to `0` to return to the default behavior, where the output does not include the wake word audio.

**Note:** This setting is a synonym for [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence) when used with these templates. If you set both `include-wake-word-audio` and `include-leading-silence`, `include-wake-word-audio` takes precedence.
**Also see these related items:** [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence), [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out), [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through)

### leading-silence

- configuration
- int
- read-write

**C/C++**

```c
snsrSetInt(s, SNSR_LEADING_SILENCE, value);
```

**Java**

```java
s.setInt(Snsr.LEADING_SILENCE, value);
```

**Python**

```python
s.set_int(snsr.LEADING_SILENCE, value)
```

VAD leading silence time-out, in ms.

The VAD will invoke the [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) event handler if no speech is detected during the first `leading-silence` ms of processed audio.

**Also see these related items:** [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence), [trailing-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#trailing-silence)

### max-recording

- configuration
- int
- read-write

**C/C++**

```c
snsrSetInt(s, SNSR_MAX_RECORDING, value);
```

**Java**

```java
s.setInt(Snsr.MAX_RECORDING, value);
```

**Python**

```python
s.set_int(snsr.MAX_RECORDING, value)
```

VAD maximum record duration, in ms.

The VAD will invoke the [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit) event handler if the detected speech segment exceeds this value.

**Also see these related items:** [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit)

### pass-through

- configuration
- int
- read-write

**C/C++**

```c
snsrSetInt(s, SNSR_PASS_THROUGH, value);
```

**Java**

```java
s.setInt(Snsr.PASS_THROUGH, value);
```

**Python**

```python
s.set_int(snsr.PASS_THROUGH, value)
```

VAD audio pass-through behavior.
If set to `0`, no audio from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm) will be passed through to [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out). The begin- and endpoint handlers will still be invoked. The default value, `1`, passes speech-detected samples to [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out).

**Also see these related items:** [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence)

### trailing-silence

- configuration
- int
- read-write

**C/C++**

```c
snsrSetInt(s, SNSR_TRAILING_SILENCE, value);
```

**Java**

```java
s.setInt(Snsr.TRAILING_SILENCE, value);
```

**Python**

```python
s.set_int(snsr.TRAILING_SILENCE, value)
```

VAD trailing silence time-out, in ms.

The VAD will invoke the [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end) event handler once `trailing-silence` ms of silence has followed the last bit of speech.

**Also see these related items:** [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence)

## Wake word & command set

### delay

- configuration
- int
- read-only
- deprecated [6.16.0](https://doc.sensory.com/tnl/7.8/changes/version-6.md#v6.16.0)

**C/C++**

```c
int value;
snsrGetInt(s, SNSR_SPOT_DELAY, &value);
```

**Java**

```java
int value = s.getInt(Snsr.SPOT_DELAY);
```

**Python**

```python
value = s.get_int(snsr.SPOT_DELAY)
```

Phrase spotter delay in ms.

**Deprecated:** Support for this setting will be removed from the next major release of the TrulyNatural SDK. First deprecated in release 6.16.0 (2021-06-06) and made read-only in 7.0.0 (2023-11-20).
The cumulative recognition score for a wake word or command recognizer can exceed the decision threshold before the end of the utterance. This setting controls how long the recognizer will wait while the recognition score is still increasing before reporting the event. Longer delays can increase the time alignment accuracy of the end of the spotted phrase. ### duration-ms - configuration - double - read-write **C/C++** ```c snsrSetDouble(s, SNSR_DURATION_MS, value); ``` **Java** ```java s.setDouble(Snsr.DURATION_MS, value); ``` **Python** ```python s.set_double(snsr.DURATION_MS, value) ``` Low false-reject listening window. Selects the time window in ms following a close false-reject that smart wake words will use [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point) instead of [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point). Defaults to `10` seconds if not explicitly set. **Also see these related items:** [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) ### listen-window - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_LISTEN_WINDOW, value); ``` **Java** ```java s.setInt(Snsr.LISTEN_WINDOW, value); ``` **Python** ```python s.set_int(snsr.LISTEN_WINDOW, value) ``` Phrase spot listening window in seconds or milliseconds. This is the duration that a spotter will listen for a command before timing out. Spotters with short listening windows are typically optimized to have lower false reject, but higher false accept rates. If this value is `120` or less it is in seconds. Values larger than `120` are in ms. In wake word spotters tuned for continuous listening this value is `0`. 
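As a rough illustration of the `listen-window` unit rule above, the following sketch converts a setting value to milliseconds. The helper name `listen_window_ms` is hypothetical and not part of the SDK, and rejecting negative values is an assumption made for this example.

```python
def listen_window_ms(value: int) -> int:
    """Interpret a listen-window setting value as a duration in milliseconds.

    Hypothetical helper, not an SDK function.
    """
    if value < 0:
        # Assumption: negative windows are treated as invalid in this sketch.
        raise ValueError("listen-window must not be negative")
    # Values of 120 or less are in seconds; larger values are already in ms.
    # A value of 0 (continuous listening) maps to 0 ms.
    return value * 1000 if value <= 120 else value
```

For example, a setting of `3` corresponds to 3000 ms, while a setting of `5000` is already in ms and is returned unchanged.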
**Note:** This value is only used when: * Converting models to DSP format for embedded use. * The spotter is used in slot `1` of the [tpl-spot-sequential](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-sequential.md#tpl-spot-sequential-type) spotter template model. In all other cases spotters listen continuously, regardless of the value of `listen-window`. **Also see these related items:** [tpl-spot-sequential](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-sequential.md#tpl-spot-sequential-type), [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop), [What is a Command Set?](https://doc.sensory.com/tnl/7.8/faq.md#use-command-set) ### low-fr-operating-point - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_LOW_FR_OPERATING_POINT, value); ``` **Java** ```java s.setInt(Snsr.LOW_FR_OPERATING_POINT, value); ``` **Python** ```python s.set_int(snsr.LOW_FR_OPERATING_POINT, value) ``` Low false-reject spotter operating point. Selects the low false-reject fall-back operating point used by smart wake words. This low false-reject operating point is selected for [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms) if a spot was rejected at [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) but would have been accepted at `low-fr-operating-point`. **Also see these related items:** [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) ### operating-point - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_OPERATING_POINT, value); ``` **Java** ```java s.setInt(Snsr.OPERATING_POINT, value); ``` **Python** ```python s.set_int(snsr.OPERATING_POINT, value) ``` Spotter operating point. 
Selects the trade-off between false accept and false reject errors for wake word and command set recognizers. **Higher-numbered points are more accepting.** * The valid range is from `1` to `21` inclusive. * Lower-numbered points have a lower false accept rate at the expense of a higher false reject rate. * The false accept rate is expressed as the expected number of false accepts (where the recognizer mistakenly spots the trigger phrase) per time unit. For example, 1.2 false accepts per day. * The false reject rate is the percentage of times the actual trigger phrase is spoken, but not recognized. For example, 4.5%. * The default operating point is selected by Sensory during trigger development for a good balance between these two error types. * Not all operating points are necessarily valid. Use [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator) to find all the [available points](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#available-point). **Also see these related items:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms) ### score-offset - configuration - double - read-write **C/C++** ```c snsrSetDouble(s, SNSR_SCORE_OFFSET, value); ``` **Java** ```java s.setDouble(Snsr.SCORE_OFFSET, value); ``` **Python** ```python s.set_double(snsr.SCORE_OFFSET, value) ``` **Reserved:** Do not use unless recommended by Sensory. ### sv-threshold - configuration - double - read-write **C/C++** ```c snsrSetDouble(s, SNSR_SV_THRESHOLD, value); ``` **Java** ```java s.setDouble(Snsr.SV_THRESHOLD, value); ``` **Python** ```python s.set_double(snsr.SV_THRESHOLD, value) ``` Enrolled wake word speaker verification threshold. 
Enrolled wake word results with a [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) less than this threshold are not reported. Increase this threshold to reduce the chance that someone other than the enrolled speaker triggers the phrase spotter. **Also see these related items:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) ### threshold - configuration - int - read-write - deprecated [7.4.0](https://doc.sensory.com/tnl/7.8/changes/index.md#v7.4.0) **C/C++** ```c snsrSetInt(s, SNSR_THRESHOLD, value); ``` **Java** ```java s.setInt(Snsr.THRESHOLD, value); ``` **Python** ```python s.set_int(snsr.THRESHOLD, value) ``` Dynamic operating point selection threshold. **Deprecated:** Superseded by built-in support for smart wake words in TrulyNatural 7.4.0. Selects the threshold used by `tpl-spot-dynop-1.4.0.snsr` to decide whether to select the [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point). **Also see these related items:** [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) ## LVCSR & STT ### ac-prune-top-k - configuration - int - read-write _(TrulyNatural only)_ **C/C++** ```c snsrSetInt(s, SNSR_AC_PRUNE_TOP_K, value); ``` **Java** ```java s.setInt(Snsr.AC_PRUNE_TOP_K, value); ``` **Python** ```python s.set_int(snsr.AC_PRUNE_TOP_K, value) ``` Reduce LVCSR decoder CPU use. This setting trades CPU use for recognition accuracy. 
A subset of recognizers optimized for low resource use created by [VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub) allow reducing the CPU cycles used in search decoding at the expense of an increased recognition error rate. Set to `0` to disable. ### am-size - configuration - double - read-only - STT only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_AM_SIZE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_AM_SIZE); ``` **Python** ```python value = s.get_double(snsr.RES_AM_SIZE) ``` Size of STT acoustic model, in bytes. **Note:** Not supported for all STT models. **Also see these related items:** [lm-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#lm-size), [nlu-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#nlu-size), [slm-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slm-size) ### backlog-interval - configuration - double - read-write _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c snsrSetDouble(s, SNSR_BACKLOG_INTERVAL, value); ``` **Java** ```java s.setDouble(Snsr.BACKLOG_INTERVAL, value); ``` **Python** ```python s.set_double(snsr.BACKLOG_INTERVAL, value) ``` Partial result update interval used while processing an audio backlog. This setting overrides [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval) when recognizing audio that precedes a wake word, enabled by setting [wake-word-at-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#wake-word-at-end) to `1` or `2`. By default `backlog-interval = 0` for the lowest recognition result latency. 
**Also see these related items:** [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval), [wake-word-at-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#wake-word-at-end), [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type), [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type) ### complete-only - configuration - int - read-write - TrulyNatural only **C/C++** ```c snsrSetInt(s, SNSR_COMPLETE_ONLY, value); ``` **Java** ```java s.setInt(Snsr.COMPLETE_ONLY, value); ``` **Python** ```python s.set_int(snsr.COMPLETE_ONLY, value) ``` Controls whether incomplete LVCSR results are accepted. The [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) result available in the [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) callback for LVCSR recognizers reports the recognition result that best matches the acoustic evidence the recognizer saw. With `complete-only = 0` the behavior is to show incomplete results, even if they are not accepted by the grammar specification. For example, if a custom recognizer uses ``` grammar = ~~1 2 3 4 5 6 7 8 9 10~~ ; ``` and the audio contains only "1 2 3 4", then the final result will be "1 2 3 4". If this behavior is not desirable, setting `complete-only = 1` will suppress such incomplete results. The [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) callback will still happen, but [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) will be ``. The [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) events will not be invoked. If `complete-only` isn't explicitly set it defaults to `1`. 
**Also see these related items:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) ### custom-vocab - configuration - string - read-write - STT only **C/C++** ```c snsrSetString(s, SNSR_CUSTOM_VOCAB, "text"); ``` **Java** ```java s.setString(Snsr.CUSTOM_VOCAB, "text"); ``` **Python** ```python s.set_string(snsr.CUSTOM_VOCAB, "text") ``` Custom STT vocabulary. STT recognizers occasionally do not have full vocabulary coverage for low-frequency words, proper names, trade marks, and such. Use this custom vocabulary setting to add new words to a recognizer. **Note:** Use custom vocabulary to address minor recognition issues. For more than a couple of hundred entries you'll get better performance with a domain-specific STT model. Contact your Sensory sales representative to explore options. Map format: ``` output word or phrase [, incorrect result [, incorrect result [, ...]]] ... ``` * New vocabulary word or phrase, * followed by zero or more mis-recognized examples, each prefixed with a `,` separator. * Vocabulary entries are separated by `\r`, `\n` or `;` **Example custom vocabulary:** **`custom-vocab.txt`** ```sh voice genie #(1)! voice genie, voice jenny #(2)! armadillo, i'm adello, amadello #(3)! ``` 1. New vocabulary phrase, without any mis-recognized alternates. If "voice genie" is one of the alternates the recognizer is considering this will increase the likelihood that it is selected as the result. 2. If the STT engine were to recognize "voice jenny" it will be rewritten to "voice genie" 3. If the STT engine recognizes "i'm adello" or "amadello" these will both be rewritten as "armadillo" in the result. 
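The map format described above can be assembled programmatically. The sketch below builds a `custom-vocab` string from (output phrase, mis-recognized alternates) pairs. The function name `build_custom_vocab` is hypothetical and not part of the SDK, and rejecting fields that contain a separator character is an assumption, since the format description does not mention an escape mechanism.

```python
def build_custom_vocab(entries):
    """Build a custom-vocab map string (hypothetical helper, not an SDK call).

    entries: iterable of (output, alternates) pairs, where output is the new
    word or phrase and alternates is a list of mis-recognized results.
    """
    lines = []
    for output, alternates in entries:
        fields = [output, *alternates]
        for field in fields:
            # Assumption: separators cannot be escaped, so reject them here.
            if not field.strip() or any(ch in field for ch in ",;\r\n"):
                raise ValueError(f"invalid vocabulary field: {field!r}")
        # Output phrase first, then each alternate prefixed with a comma.
        lines.append(", ".join(fields))
    # Entries may be separated by "\r", "\n" or ";"; this sketch uses "\n".
    return "\n".join(lines)


vocab = build_custom_vocab([
    ("voice genie", []),
    ("voice genie", ["voice jenny"]),
    ("armadillo", ["i'm adello", "amadello"]),
])
```

The resulting string could then be passed to the `custom-vocab` setting, for example with `s.set_string(snsr.CUSTOM_VOCAB, vocab)` as shown in the setting's signature above.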
**Example:** ```console % snsr-eval -t model/stt-enUS-automotive-medium-2.3.15-pnc.snsr \ -s partial-result-interval=0 \ data/enrollments/armadillo-1-4-c.wav NLU intent: no_command = an anlla record a video 400 1720 an anlla record a video % snsr-eval -t model/stt-enUS-automotive-medium-2.3.15-pnc.snsr \ -s partial-result-interval=0 \ -s 'custom-vocab="armadillo, an anlla; jackalope"' \ data/enrollments/armadillo-1-4-c.wav NLU intent: no_command = armadillo record a video 400 1720 armadillo record a video ``` ### lm-size - configuration - double - read-only - STT only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_LM_SIZE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_LM_SIZE); ``` **Python** ```python value = s.get_double(snsr.RES_LM_SIZE) ``` Size of STT language model, in bytes. **Note:** Not supported for all STT models. **Also see these related items:** [am-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#am-size), [nlu-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#nlu-size), [slm-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slm-size) ### partial-result-interval - configuration - double - read-write _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c snsrSetDouble(s, SNSR_PARTIAL_RESULT_INTERVAL, value); ``` **Java** ```java s.setDouble(Snsr.PARTIAL_RESULT_INTERVAL, value); ``` **Python** ```python s.set_double(snsr.PARTIAL_RESULT_INTERVAL, value) ``` Partial result update interval. The current preliminary result is emitted every `partial-result-interval` milliseconds. Set to `0` to disable partial result reporting. **Warning:** Do not change `partial-result-interval` from an event handler, or while a model is running. Using `partial-result-interval` values below 120 ms adds significant CPU overhead. **Note:** In STT models this also sets the interval at which the model is evaluated. 
Less frequent updates trade preliminary result latency for lower average CPU use. Set to `0` for the lowest possible evaluation rate and CPU use. **Also see these related items:** [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) ### ram-limit - configuration - double - read-write _(TrulyNatural only)_ **C/C++** ```c snsrSetDouble(s, SNSR_RAM_LIMIT, value); ``` **Java** ```java s.setDouble(Snsr.RAM_LIMIT, value); ``` **Python** ```python s.set_double(snsr.RAM_LIMIT, value) ``` Limit LVCSR decoder memory use. The amount of heap RAM to allocate to LVCSR search decoding, in bytes. A subset of recognizers optimized for low resource use created by [VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub) allow limiting the amount of heap RAM to allocate to search decoding. This setting modifies this limit. Lower values can increase error rates, so set this to as large a value as constraints allow. Set to `0` to disable the limit. **Also see these related items:** [ac-prune-top-k](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#ac-prune-top-k) ### result-max - configuration - int - read-write - TrulyNatural only **C/C++** ```c snsrSetInt(s, SNSR_RESULT_MAX, value); ``` **Java** ```java s.setInt(Snsr.RESULT_MAX, value); ``` **Python** ```python s.set_int(snsr.RESULT_MAX, value) ``` The maximum number of alternate phrase results to consider. Limits the number of alternate phrases returned by LVCSR models. If `result-max > 1`, [phrase-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phrase-iterator) will return phrase-level recognition results in order of likelihood. The default is `result-max == 1`, which returns only the most likely result. 
Limitations: - [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) and [phone-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phone-iterator) are available for the most likely result only. - Time alignments are accurate for the most likely result only. - [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score) values are not usable when `result-max > 1`. - Silence markup is elided from all but the top scoring phrase. An empty [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) result indicates that silence was the best match to the acoustic input. **Warning:** N-best processing is computationally expensive, frequently prohibitively so. Contact Sensory for guidance before using this feature in production. **Also see these related items:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [phrase-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phrase-iterator) ### search.frame-nota - configuration - double - read-write - TrulyNatural only **C/C++** ```c snsrSetDouble(s, SNSR_OOV_REJECT, value); ``` **Java** ```java s.setDouble(Snsr.OOV_REJECT, value); ``` **Python** ```python s.set_double(snsr.OOV_REJECT, value) ``` Out-of-vocabulary rejection sensitivity. This setting controls out-of-vocabulary rejection in custom LVCSR recognizers. Custom LVCSR recognizers report `` for words or phrases that are not in the grammar. With a `search.frame-nota` value of `0` the recognizer will never report ``; it will return the closest match instead. With `search.frame-nota` at `1.0`, almost all input will return ``. The optimal value for `search.frame-nota` depends on the vocabulary used. A reasonable value to start testing with is `0.2`. **Note:** Do not change `search.frame-nota` for models that include statistical language model components. 
These models typically have either `-broad-` or `-background-` in the model name, and are configured to use the language model to recognize utterances not covered by the custom grammar. **Also see these related items:** [grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream) ### show-silence - configuration - int - read-write - TrulyNatural only **C/C++** ```c snsrSetInt(s, SNSR_SHOW_SILENCE, value); ``` **Java** ```java s.setInt(Snsr.SHOW_SILENCE, value); ``` **Python** ```python s.set_int(snsr.SHOW_SILENCE, value) ``` Include silence in recognizer results. When set to `1`, LVCSR recognition results include word-pause ``, sentence-begin `~~`, and sentence-end `~~` markup. The default value is `0`, which elides these from results. **Also see these related items:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) ### stt-profile - configuration - string - read-write _(STT only)_ **C/C++** ```c snsrSetString(s, SNSR_STT_PROFILE, "text"); ``` **Java** ```java s.setString(Snsr.STT_PROFILE, "text"); ``` **Python** ```python s.set_string(snsr.STT_PROFILE, "text") ``` Select STT speed vs accuracy trade-off. Default value is `accurate`, set to `fast` to reduce CPU load at the expense of recognition accuracy. 
**Also see these related items:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) ## NLU & SLM ### nlu-match-max - configuration - int - read-write - TrulyNatural only **C/C++** ```c snsrSetInt(s, SNSR_NLU_RES_MAX, value); ``` **Java** ```java s.setInt(Snsr.NLU_RES_MAX, value); ``` **Python** ```python s.set_int(snsr.NLU_RES_MAX, value) ``` The maximum number of alternate NLU matches to consider. Limits the number of [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) callbacks issued in case of multiple valid NLU matches to the recognition result. The default value is `1`, limiting NLU results to the best-scoring match only. **Also see these related items:** [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot), [nlu-match-index](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-match-index), [nlu-slot-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-count) ### nlu-size - configuration - double - read-only - STT only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_NLU_SIZE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_NLU_SIZE); ``` **Python** ```python value = s.get_double(snsr.RES_NLU_SIZE) ``` Size of STT NLU model, in bytes. **Note:** Not supported for all STT models. 
**Also see these related items:** [am-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#am-size), [lm-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#lm-size), [slm-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slm-size) ### slm-enabled - configuration - int - read-write _(STT only)_ **C/C++** ```c snsrSetInt(s, SNSR_SLM_ENABLED, value); ``` **Java** ```java s.setInt(Snsr.SLM_ENABLED, value); ``` **Python** ```python s.set_int(snsr.SLM_ENABLED, value) ``` Enable optional SLM component. Set to `0` to turn the SLM component off, `1` to turn on. **Also see these related items:** [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start), [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result), [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial), [slm-turn-limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slm-turn-limit) ### slm-size - configuration - double - read-only _(STT only)_ **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_SLM_SIZE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_SLM_SIZE); ``` **Python** ```python value = s.get_double(snsr.RES_SLM_SIZE) ``` Size of STT SLM, in bytes. **Note:** Not supported for all STT models. **Also see these related items:** [am-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#am-size), [lm-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#lm-size), [nlu-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#nlu-size) ### slm-turn-limit - configuration - int - read-write _(STT only)_ **C/C++** ```c snsrSetInt(s, SNSR_SLM_TURN_LIMIT, value); ``` **Java** ```java s.setInt(Snsr.SLM_TURN_LIMIT, value); ``` **Python** ```python s.set_int(snsr.SLM_TURN_LIMIT, value) ``` Configure SLM history behavior. 
If `slm-turn-limit >= 0` the optional SLM component limits the number of conversational turns in the model history. The default is `-1`, which keeps all history. Writing to `slm-turn-limit` discards existing history. **Note:** Values larger than `0` increase the SLM result latency and CPU use. **Also see these related items:** [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result), [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial), [slm-enabled](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slm-enabled) ## Templates & slots ### 0. - configuration - stream - read-write **C/C++** ```c snsrSetStream(s, SNSR_SLOT_0, stream); ``` **Java** ```java s.setStream(Snsr.SLOT_0, stream); ``` **Python** ```python s.set_stream(snsr.SLOT_0, stream) ``` Template slot `0`. The first slot in a template task. Template slots expect a [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) opened on a `.snsr` model file. You can also use this string value as an argument with [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot). **Also see these related items:** [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot) ### 1. - configuration - stream - read-write **C/C++** ```c snsrSetStream(s, SNSR_SLOT_1, stream); ``` **Java** ```java s.setStream(Snsr.SLOT_1, stream); ``` **Python** ```python s.set_stream(snsr.SLOT_1, stream) ``` Template slot `1`. The second slot in a template task. Template slots expect a [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream) opened on a `.snsr` model file. You can also use this string value as an argument with [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot). 
**Also see these related items:** [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot) ### include-model - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_INCLUDE_MODEL, value); ``` **Java** ```java s.setInt(Snsr.INCLUDE_MODEL, value); ``` **Python** ```python s.set_int(snsr.INCLUDE_MODEL, value) ``` Debug log includes a copy of the model. This boolean value controls whether the [debug-log-file](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#debug-log-file) includes a copy of the task model (the `.snsr` file). The default value is `1`. Set `include-model=0` for smaller (but less complete) debug log files. **Also see these related items:** [debug-log-file](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#debug-log-file) ### loop - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_LOOP, value); ``` **Java** ```java s.setInt(Snsr.LOOP, value); ``` **Python** ```python s.set_int(snsr.LOOP, value) ``` Control template looping behavior. In [tpl-spot-sequential](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-sequential.md#tpl-spot-sequential-type), setting this value to `1` changes _when_ the listening focus returns to slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0). Instead of immediately returning to slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) after a spot in slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1), it resets the expiration timer, and only a [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1).[listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window) timeout returns to slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0). This allows for a wake word followed by zero or more commands from a command set. 
The default behavior (`loop = 0`) is to allow at most one command before requiring another wake word utterance. Setting `loop = 2` pins the listening focus to slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1). Use this, for example, if an application needs to gate a command set recognizer with a wake word or an external event such as a push-to-talk button. **Also see these related items:** [tpl-spot-sequential](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-sequential.md#tpl-spot-sequential-type), [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window) ### slot - configuration - string - read-write **C/C++** ```c snsrSetString(s, SNSR_SLOT, "text"); ``` **Java** ```java s.setString(Snsr.SLOT, "text"); ``` **Python** ```python s.set_string(snsr.SLOT, "text") ``` Template slot selector. Use with [tpl-spot-select](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-select.md#tpl-spot-select-type) and [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type) to select the active slot. **Also see these related items:** [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0), [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1), [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot), [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr) ### wake-word-at-end - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_WAKE_WORD_AT_END, value); ``` **Java** ```java s.setInt(Snsr.WAKE_WORD_AT_END, value); ``` **Python** ```python s.set_int(snsr.WAKE_WORD_AT_END, value) ``` Support for trailing wake words. Setting this to `1` or `2` enables support for recognizing utterances gated by a wake word at the end of an utterance, in addition to gating by a wake word at the start. `1` does not include the wake word audio, but `2` does. 
`0` turns this feature off. **Note:** This feature requires an [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size) large enough to hold the entire utterance. Enabling `wake-word-at-end` will increase this to ten seconds if it starts out smaller. If the utterance before the wake word does not fit into the `audio-stream-size` ring buffer, the VAD will invoke the [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit) event instead of [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end). **Also see these related items:** [backlog-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backlog-interval), [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type), [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type), [tpl-spot-vad](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad.md#tpl-spot-vad-type), [use-trailing-wake-word](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#use-trailing-wake-word) ## Enrollment & adaptation ### accuracy - configuration - double - read-write **C/C++** ```c snsrSetDouble(s, SNSR_ACCURACY, value); ``` **Java** ```java s.setDouble(Snsr.ACCURACY, value); ``` **Python** ```python s.set_double(snsr.ACCURACY, value) ``` Enrollment accuracy. Trades accuracy of the enrolled model for enrollment speed. The default accuracy is `1.0`, for the best accuracy at the slowest enrollment speed. Valid range is `0.0` to `1.0`. 
**Also see these related items:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^enrolled](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#enrolled) ### ctx-enroll - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_ENROLLMENT_CONTEXT, value); ``` **Java** ```java s.setInt(Snsr.ENROLLMENT_CONTEXT, value); ``` **Python** ```python s.set_int(snsr.ENROLLMENT_CONTEXT, value) ``` Number of enrollments with trailing context. The recommended number of enrollments where the phrase is followed by additional speech. For example: "Hey Sensory will it rain tomorrow?" ### enrollment-task-index - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_ENROLLMENT_TASK_INDEX, value); ``` **Java** ```java s.setInt(Snsr.ENROLLMENT_TASK_INDEX, value); ``` **Python** ```python s.set_int(snsr.ENROLLMENT_TASK_INDEX, value) ``` The index of the sub-task to enroll. For enrollment tasks that contain multiple sub-tasks (for example, a user-defined trigger and an enrolled fixed trigger), this integer value selects which of the sub-tasks the enrollments should be applied to. See the documentation delivered with the task file for the sub-task mapping. **Note:** For most enrollment tasks the only supported task index is `0`. ### interactive - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_INTERACTIVE_MODE, value); ``` **Java** ```java s.setInt(Snsr.INTERACTIVE_MODE, value); ``` **Python** ```python s.set_int(snsr.INTERACTIVE_MODE, value) ``` Interactive enrollment mode. This changes the enrollment task behavior: When set to `0`, enrollment for the current phrase will continue until the end of the stream. 
**Also see these related items:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^enrolled](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#enrolled) ### max-users - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_MAX_USERS, value); ``` **Java** ```java s.setInt(Snsr.MAX_USERS, value); ``` **Python** ```python s.set_int(snsr.MAX_USERS, value) ``` Maximum number of users to adapt to. Sets a limit to the number of distinct users a continuously adapting fixed-phrase spotter will enroll. ### req-enroll - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_ENROLLMENT_TARGET, value); ``` **Java** ```java s.setInt(Snsr.ENROLLMENT_TARGET, value); ``` **Python** ```python s.set_int(snsr.ENROLLMENT_TARGET, value) ``` Enrollment target. The recommended number of enrollments for each user. Using either more or fewer enrollments will reduce overall spotter performance. **Also see these related items:** [user-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#user-iterator), [enrollment-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#enrollment-count) ### save-enroll-audio - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_SAVE_ENROLLMENT_AUDIO, value); ``` **Java** ```java s.setInt(Snsr.SAVE_ENROLLMENT_AUDIO, value); ``` **Python** ```python s.set_int(snsr.SAVE_ENROLLMENT_AUDIO, value) ``` Include enrollment audio in the enrollment context. Set to `1` to include the enrollment audio in enrollment contexts, `0` to exclude. 
**Also see these related items:** [RUNTIME](https://doc.sensory.com/tnl/7.8/api/inference.md#fm_runtime), [enrollment-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#enrollment-iterator) ### user - configuration - string - read-write **C/C++** ```c snsrSetString(s, SNSR_USER, "text"); ``` **Java** ```java s.setString(Snsr.USER, "text"); ``` **Python** ```python s.set_string(snsr.USER, "text") ``` Enrolling user tag. Sets the tag for the current enrollment. This should be a unique alphanumeric phrase, without spaces. It is the phrase returned as a recognition result. If enrolling more than one phrase for any of the users, the tag *must* contain one `/` that separates a user-specific part from the phrase part. For example: `user1/phrase1`, `user2/phrase1`, `user2/phrase2`. **Also see these related items:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^enrolled](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#enrolled) ## Push / pull execution ### push-buffer-backlog - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_RES_PUSH_BUFFER_BACKLOG, value); ``` **Java** ```java s.setInt(Snsr.RES_PUSH_BUFFER_BACKLOG, value); ``` **Python** ```python s.set_int(snsr.RES_PUSH_BUFFER_BACKLOG, value) ``` Reports the number of bytes of deferred [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) data. If [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) is used with a [push-duration-limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#push-duration-limit), this setting reports the number of bytes deferred for processing in subsequent calls to [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push). 
**Also see these related items:** [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push), [push-buffer-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#push-buffer-size), [push-duration-limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#push-duration-limit) ### push-buffer-size - configuration - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_PUSH_BUFFER_SIZE, value); ``` **Java** ```java s.setInt(Snsr.PUSH_BUFFER_SIZE, value); ``` **Python** ```python s.set_int(snsr.PUSH_BUFFER_SIZE, value) ``` The size of the internal ring buffers used by [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push). If [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) is used with a [push-duration-limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#push-duration-limit), processing will require deferral if the duration limit is reached. In this case, [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) will allocate a ring buffer to hold these data. This setting configures the size of this buffer, in bytes. The default buffer size is sufficient to defer up to 250 ms of audio data. **Also see these related items:** [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push), [push-duration-limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#push-duration-limit), [push-buffer-backlog](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#push-buffer-backlog) ### push-duration-limit - configuration - double - read-write **C/C++** ```c snsrSetDouble(s, SNSR_PUSH_DURATION_LIMIT, value); ``` **Java** ```java s.setDouble(Snsr.PUSH_DURATION_LIMIT, value); ``` **Python** ```python s.set_double(snsr.PUSH_DURATION_LIMIT, value) ``` Sets a limit to the maximum processing time [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) should consume. 
This setting is the maximum number of milliseconds any call to [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) should spend processing data before returning control to the caller. The default value is `0`, which disables the processing limit. **Note:** This requires a valid real-time clock function, see [CONFIG_CLOCK_FUNC](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_clock_func). TrulyNatural SDK libraries for Android, Linux, macOS, iOS, C, Java, and Python include real-time clock functions and require no additional configuration. You should use a `push-duration-limit` if: * You're using [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push), and * you collect live audio on the same thread as the recognizer, and * you will drop audio packets if you don't return from [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) before the next packet is available. `push-duration-limit` adds a cap to the amount of CPU used in each call to [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push). This requires and allocates an additional input ring buffer that's [push-buffer-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#push-buffer-size) bytes in size. If you have a separate thread, or interrupt-driven live audio recording and you want to maximize throughput, increase the size of the audio ring buffer instead of using a `push-duration-limit`. Recommendations: * Use 15 ms audio chunks. * The audio recording buffer size determines the longest time the average recognizer throughput can fall behind real time. * With a 30 ms buffer only two 15 ms blocks fit, which means that every SDK processing call must return within 15 ms, or a block or partial block will be lost. * Using a 300 ms buffer relaxes this. 20 blocks mean that the recognizer can fall up to 18 blocks (270 ms) behind before losing audio. 
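The buffer-size recommendations above come down to simple arithmetic, sketched below. This is an illustration only: `buffer_slack` is a hypothetical helper, not part of the SDK, and the "two blocks are always in use" rule is an assumption inferred from the 30 ms and 300 ms figures quoted above (one block being filled by the recorder, one being processed).

```python
def buffer_slack(buffer_ms: int, chunk_ms: int = 15) -> tuple[int, int, int]:
    """Return (blocks, slack_blocks, slack_ms) for an audio recording buffer.

    Assumption (inferred from the figures in the text, not from the SDK):
    one block is being filled and one is being processed at any time, so
    only the remaining blocks can absorb recognizer lag.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    blocks = buffer_ms // chunk_ms          # whole chunks that fit in the buffer
    slack_blocks = max(blocks - 2, 0)       # blocks the recognizer may fall behind
    return blocks, slack_blocks, slack_blocks * chunk_ms


for size_ms in (30, 300):
    blocks, slack_blocks, slack_ms = buffer_slack(size_ms)
    print(f"{size_ms} ms buffer: {blocks} blocks, "
          f"{slack_blocks} blocks ({slack_ms} ms) of slack")
```

With 15 ms chunks this reproduces the two cases in the list: a 30 ms buffer leaves no slack, and a 300 ms buffer tolerates a 270 ms lag.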
**Also see these related items:** [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push), [push-buffer-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#push-buffer-size), [push-buffer-backlog](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#push-buffer-backlog), [CLOCK_FUNC](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_clock_func) ## Logging & diagnostics ### cache-file - configuration - string - read-write **C/C++** ```c snsrSetString(s, SNSR_CACHE_FILE, "text"); ``` **Java** ```java s.setString(Snsr.CACHE_FILE, "text"); ``` **Python** ```python s.set_string(snsr.CACHE_FILE, "text") ``` Continuous Adaptation cache file name. When set, enrolled user data will be saved to, and loaded from this file. If not set, enrolled user data are discarded when the spotter session is released. This setting is only available in fixed-phrase spotters that support continuous adaptation. If you need more control over how or when the enrollment context is saved you can do this from the [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted) callback handler. **Also see these related items:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted) ### debug-log-file - configuration - string - read-write **C/C++** ```c snsrSetString(s, SNSR_DEBUG_LOG_FILE, "text"); ``` **Java** ```java s.setString(Snsr.DEBUG_LOG_FILE, "text"); ``` **Python** ```python s.set_string(snsr.DEBUG_LOG_FILE, "text") ``` Debug log filename. The name of the log file [tpl-spot-debug](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-debug.md#tpl-spot-debug-type) writes to. This value is required, and no default is defined in the template. The directory the log file is in must exist, and must be writable. 
These optional and mutually exclusive character sequences are substituted with the time stamp when the log file is first opened: * `%@` - year-month-day_hour-minute-second.milliseconds (UTC) * `%#` - milliseconds since the [epoch][]. **Example:** **C/C++** ```c snsrSetString(session, SNSR_DEBUG_LOG_FILE, "debug-%#.log"); ``` **Java** ```java session.setString(Snsr.DEBUG_LOG_FILE, "debug-%#.log"); ``` **Python** ```python session.set_string(snsr.DEBUG_LOG_FILE, "debug-%#.log") ``` ### fex-hash - configuration - string - read-only - pre-release **C/C++** ```c const char * value; snsrGetString(s, SNSR_FEATURE_HASH, &value); ``` **Java** ```java String value = s.getString(Snsr.FEATURE_HASH); ``` **Python** ```python value = s.get_string(snsr.FEATURE_HASH) ``` Feature extractor hash. **Pre-release:** This is an experimental feature. Do not use unless recommended by Sensory. This is a unique string that identifies the feature type used by the task. ## Identity & metadata ### task-name - configuration - string - read-only - deprecated [6.14.0](https://doc.sensory.com/tnl/7.8/changes/version-6.md#v6.14.0) **C/C++** ```c const char * value; snsrGetString(s, SNSR_TASK_NAME, &value); ``` **Java** ```java String value = s.getString(Snsr.TASK_NAME); ``` **Python** ```python value = s.get_string(snsr.TASK_NAME) ``` Task name. **Deprecated:** Support for this setting will be removed from the next major release of this SDK. **Do not use this in new code.** ### task-type - configuration - string - read-only **C/C++** ```c const char * value; snsrGetString(s, SNSR_TASK_TYPE, &value); ``` **Java** ```java String value = s.getString(Snsr.TASK_TYPE); ``` **Python** ```python value = s.get_string(snsr.TASK_TYPE) ``` Task type. 
This, together with [task-version](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-version), describes the model behavior: which [setting keys](https://doc.sensory.com/tnl/7.8/api/setting-keys/index.md#setting-keys) and [streams](https://doc.sensory.com/tnl/7.8/api/io.md#stream) it supports. Examples include: [enroll](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#enroll), [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr), [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot), [phrasespot-vad](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot-vad), and [vad](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#vad). **Also see these related items:** [Values](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#values), [task-version](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-version), [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require) ### task-type-and-version-list - configuration - string - write-only **C/C++** ```c snsrSetString(s, SNSR_TASK_TYPE_AND_VERSION_LIST, "text"); ``` **Java** ```java s.setString(Snsr.TASK_TYPE_AND_VERSION_LIST, "text"); ``` **Python** ```python s.set_string(snsr.TASK_TYPE_AND_VERSION_LIST, "text") ``` Verifies that a model matches one of a list of types and versions. When used with [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require), the value argument must be a semicolon-separated list of [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type) and [task-version](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-version) values. This list must have at least one element. 
A task will match the requirement if one of the [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type) fields matches, and the corresponding [task-version](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-version) is satisfied. If no [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type) matches, [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require) returns [REQUIRE_MISMATCH](https://doc.sensory.com/tnl/7.8/api/inference.md#rc). If a [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type) matches, but the associated [task-version](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-version) is not satisfied, [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require) returns [VERSION_MISMATCH](https://doc.sensory.com/tnl/7.8/api/inference.md#rc). **Example:** **C/C++** ```c snsrRequire(session, SNSR_TASK_TYPE_AND_VERSION_LIST, SNSR_PHRASESPOT " ~0.5.0 || 1.0.0;" SNSR_LVCSR " 1.0.0"); ``` **Java** ```java session.require(Snsr.TASK_TYPE_AND_VERSION_LIST, Snsr.PHRASESPOT + " ~0.5.0 || 1.0.0;" + Snsr.LVCSR + " 1.0.0"); ``` **Python** ```python session.require( snsr.TASK_TYPE_AND_VERSION_LIST, f"{snsr.PHRASESPOT} ~0.5.0 || 1.0.0;{snsr.LVCSR} 1.0.0", ) ``` **Also see these related items:** [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require) ### task-version - configuration - string - read-only **C/C++** ```c const char * value; snsrGetString(s, SNSR_TASK_VERSION, &value); ``` **Java** ```java String value = s.getString(Snsr.TASK_VERSION); ``` **Python** ```python value = s.get_string(snsr.TASK_VERSION) ``` Model task version. These version strings follow [semantic versioning](http://semver.org/) rules. 
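The way a task type and version combine in a `task-type-and-version-list` requirement can be sketched in plain Python. This is a rough model of the decision rule described for that setting, not the SDK's implementation: `check_requirement` and `exact_version` are hypothetical names, the returned strings merely stand in for the SDK return codes, the `"<task-type> <version range>"` element layout is inferred from the example strings, and the default version predicate only matches exact versions (it does not interpret semantic-version ranges such as `~0.5.0`).

```python
from typing import Callable


def exact_version(version: str, version_range: str) -> bool:
    """Stand-in predicate: true only if the range lists this exact version.

    Hypothetical simplification; operators such as "~" are NOT interpreted.
    """
    return version in (alt.strip() for alt in version_range.split("||"))


def check_requirement(
    task_type: str,
    task_version: str,
    requirement: str,
    satisfies: Callable[[str, str], bool] = exact_version,
) -> str:
    """Return "OK", "VERSION_MISMATCH" or "REQUIRE_MISMATCH" for a model."""
    entries = [e.strip() for e in requirement.split(";") if e.strip()]
    if not entries:
        raise ValueError("the list must have at least one element")
    type_matched = False
    for entry in entries:
        # Assumed layout: "<task-type> <version range>", split at the first space.
        wanted_type, _, version_range = entry.partition(" ")
        if wanted_type != task_type:
            continue
        type_matched = True
        if satisfies(task_version, version_range.strip()):
            return "OK"
    # A type matched but no version did, or nothing matched at all.
    return "VERSION_MISMATCH" if type_matched else "REQUIRE_MISMATCH"


requirement = "phrasespot ~0.5.0 || 1.0.0;lvcsr 1.0.0"
print(check_requirement("lvcsr", "1.0.0", requirement))
print(check_requirement("phrasespot", "2.0.0", requirement))
print(check_requirement("vad", "1.0.0", requirement))
```

Passing a real semantic-version range matcher as `satisfies` would be needed to handle the `~0.5.0` alternative; the default here deliberately leaves that out.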
**Also see these related items:** [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type), [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require) [epoch]: https://en.wikipedia.org/wiki/Unix_time "Unix time" [THF Micro]: https://doc.sensory.com/thf-micro/ "THF Micro documentation" *[API]: Application Programming Interface *[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[NLU]: Natural Language Understanding model *[RAM]: Random Access Memory *[SDK]: Software Development Kit *[SLM]: Generative Small Language Model *[STT]: Speech To Text: transformers with language model and CTC decoding *[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "api/setting-keys/events.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/setting-keys/events/" --- # Events Use [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) with keys in this section to register callback handlers for events. The values for these settings refer to runtime instances of code objects and are not serialized by [save](https://doc.sensory.com/tnl/7.8/api/inference.md#save) or [dup](https://doc.sensory.com/tnl/7.8/api/inference.md#dup). The [• results](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#results) and [• iterators](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#iterators) available in these handlers vary by event. See the descriptions below for details on what these are. [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) is the most used event by a large margin. ### Details: Page conventions Settings are grouped by concern. Within each group they're listed alphabetically. 
A setting that serves more than one concern appears once under its primary group; secondary groups link to it. The code tabs on each setting show one paste-able call site per language. Adapt the placeholder names (`s` for the [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session), plus `value`, `stream`, `on_event`, or `on_item` for the operand) to your code. Each example assumes the language's standard prelude: **C/C++** ```c #include <snsr.h> ``` **Java** ```java import com.sensory.speech.snsr.Snsr; import com.sensory.speech.snsr.SnsrSession; ``` **Python** ```python import snsr ``` For the full Session lifecycle (`snsrNew`, `new SnsrSession()`, `snsr.Session(...)`) see [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program). ## Audio I/O ### ^sample-count - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_SAMPLES_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.SAMPLES_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.SAMPLES_EVENT, on_event) ``` Samples available event. Raised whenever audio samples have been read from the input stream and are about to be processed. **Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ ## VAD & endpointing ### ^begin - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_BEGIN_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.BEGIN_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.BEGIN_EVENT, on_event) ``` Begin point detected VAD event. Raised when speech has been detected. 
Use [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms) or [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample) to retrieve the start point relative to the beginning of [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). **Available results:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) ### ^end - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_END_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.END_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.END_EVENT, on_event) ``` Endpoint detected VAD event. Raised [trailing-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#trailing-silence) ms after end-of-speech has been detected. Use [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms) or [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample) to retrieve the endpoint relative to the beginning of [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 
**Available results:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample), [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [trailing-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#trailing-silence) ### ^limit - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_LIMIT_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.LIMIT_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.LIMIT_EVENT, on_event) ``` Maximum recording reached VAD event. Raised when [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording) ms of speech has been processed by the VAD before a trailing-silence endpoint is found. Use [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms) or [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample) to retrieve the endpoint relative to the beginning of [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 
**Available results:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample), [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence), [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording) ### ^silence - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_SILENCE_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.SILENCE_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.SILENCE_EVENT, on_event) ``` No speech detected event. 
Raised if no speech is detected within [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) ms from the start of [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), adjusted by [skip-to-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-ms) or [skip-to-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-sample). **Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), [skip-to-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-ms), [skip-to-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-sample) ## Wake word & command set ### ^result - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_RESULT_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.RESULT_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.RESULT_EVENT, on_event) ``` Recognition result available event. Raised when a final recognition hypothesis is available. 
**Available results:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [confidence-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#confidence-score), [domain](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#domain), [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample), [id](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#id), [noise-energy](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#noise-energy), [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score), [signal-energy](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#signal-energy), [snr](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#snr), [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score), [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) **Available iterators:** [phone-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phone-iterator), [phrase-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phrase-iterator), [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count), [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) **Also see these related items:** [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) ## Templates & slots ### ^listen-begin - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_LISTEN_BEGIN_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.LISTEN_BEGIN_EVENT, (session, key) -> { /* ... 
*/ }); ``` **Python** ```python s.set_handler(snsr.LISTEN_BEGIN_EVENT, on_event) ``` Sequential task has started listening on second slot. Raised in a sequential task when audio focus has shifted from the first slot to the second. This typically happens when the spotter in the first slot has detected the phrase. This event is ignored for tasks that do not feature sequential behavior. **Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end) ### ^listen-end - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_LISTEN_END_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.LISTEN_END_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.LISTEN_END_EVENT, on_event) ``` Sequential task has stopped listening on second slot. Raised in a sequential task when audio focus has shifted from the second slot back to the first. This event is ignored for tasks that do not feature sequential behavior. **Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin) ## LVCSR & STT ### ^result - event - handle - write-only Documented under **Wake word & command set** ([^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result)). Emitted by every recognizer with a final result. ### ^result-partial - event - handle - write-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c snsrSetHandler(s, SNSR_PARTIAL_RESULT_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.PARTIAL_RESULT_EVENT, (session, key) -> { /* ... 
*/ }); ``` **Python** ```python s.set_handler(snsr.PARTIAL_RESULT_EVENT, on_event) ``` Partial recognition result available event. Raised when a preliminary recognition hypothesis is available. **Available results:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [domain](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#domain), [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample), [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count), [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score), [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) **Available iterators:** [phone-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phone-iterator), [phrase-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phrase-iterator), [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) **Also see these related items:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) ## NLU & SLM ### ^nlu-intent - event - handle - write-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c snsrSetHandler(s, SNSR_NLU_INTENT_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.NLU_INTENT_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.NLU_INTENT_EVENT, on_event) ``` NLU intent available. This event is raised when a natural language parse result is available. Each intent that matched input generates an `^nlu-intent` event. Use [nlu-entity-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-entity-iterator) in this callback handler to retrieve the names of entities found. 
An intent is an action, such as turning on the windshield wipers, or setting a microwave clock. Intents are exactly the same as the top-level NLU slots reported by [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot). The entities available inside the handler for this event, however, are only one level deep. The SDK flattens nested [nlu-slot-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-name)s by separating each level with a period. For example: Grammar snippet `{ date { time {hours} {minutes} } }` has intent `date` with entities `time.hours` and `time.minutes`, but slot `date` has a single child slot `time`, which in turn has two children, `hours` and `minutes`. See the [NLU section](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#nlu-markup) in the [grammar syntax](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-syntax) section for more detail on the `{}` NLU slot capturing operator. **Available results:** [nlu-intent-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-name), [nlu-intent-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-score), [nlu-intent-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-value), [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** [nlu-entity-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-entity-iterator) **Also see these related items:** [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot), [grammar syntax](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-syntax) ### ^nlu-slot - event - handle - write-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c snsrSetHandler(s, SNSR_NLU_SLOT_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.NLU_SLOT_EVENT, (session, key) -> { /* ... 
*/ }); ``` **Python** ```python s.set_handler(snsr.NLU_SLOT_EVENT, on_event) ``` NLU result available. This event is raised when a lightweight natural language parse result is available. Each top-level slot that matched input generates a `^nlu-slot` event. Use [nlu-slot-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-slot-iterator) in this callback handler to retrieve child slot names and values. See the [NLU section](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#nlu-markup) in the [grammar syntax](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-syntax) section for more detail on the `{}` NLU slot capturing operator. **Available results:** [nlu-match-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-match-count), [nlu-slot-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-count), [nlu-slot-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-name), [nlu-slot-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-score), [nlu-slot-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-value), [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** [nlu-slot-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-slot-iterator) **Also see these related items:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [grammar syntax](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-syntax) ### ^slm-result - event - handle - write-only _(STT only)_ **C/C++** ```c snsrSetHandler(s, SNSR_SLM_RESULT_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.SLM_RESULT_EVENT, (session, key) -> { /* ... 
*/ }); ``` **Python** ```python s.set_handler(snsr.SLM_RESULT_EVENT, on_event) ``` SLM final output available. This event is raised when an optional SLM has a complete result available. In the handler for this callback, [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) holds the entire SLM response. **Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count), [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) **Available iterators:** _none_ **Also see these related items:** [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial) ### ^slm-result-partial - event - handle - write-only _(STT only)_ **C/C++** ```c snsrSetHandler(s, SNSR_SLM_PARTIAL_RESULT_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.SLM_PARTIAL_RESULT_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.SLM_PARTIAL_RESULT_EVENT, on_event) ``` SLM partial output available. This event is raised when an optional SLM has a new word prediction available. In the handler for this callback, [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) holds the next word prediction. Return [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop) from the callback to abort further SLM generation; this will not stop the overall recognition pipeline. 
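The early-exit behavior of the `^slm-result-partial` handler can be pictured with a small stand-alone simulation. This is a sketch only and does not call the SDK: `RC_OK`, `RC_STOP`, `generate`, and `make_word_limit_handler` are hypothetical stand-ins for the return codes, the SLM generation loop, and an application handler, and the word limit is an arbitrary illustration.

```python
# Stand-alone simulation; it does not import or call the Sensory SDK.
# RC_OK, RC_STOP and generate() are hypothetical stand-ins.
RC_OK = "OK"
RC_STOP = "STOP"


def make_word_limit_handler(max_words, collected):
    """Return a handler that accepts max_words predictions, then asks to stop."""
    def on_partial(word):
        collected.append(word)
        return RC_STOP if len(collected) >= max_words else RC_OK
    return on_partial


def generate(words, on_partial):
    """Deliver word predictions one at a time until the handler returns RC_STOP."""
    delivered = 0
    for word in words:
        delivered += 1
        if on_partial(word) == RC_STOP:
            break  # abort further generation only; nothing else is torn down
    return delivered


collected = []
delivered = generate(
    ["turn", "on", "the", "wipers", "please"],
    make_word_limit_handler(3, collected),
)
```

In this model the handler decides when enough output has arrived; the loop that feeds it stops generating but otherwise returns normally, which mirrors the statement that the recognition pipeline keeps running.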
**Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count), [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) **Available iterators:** _none_ **Also see these related items:** [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result) ### ^slm-start - event - handle - write-only _(STT only)_ **C/C++** ```c snsrSetHandler(s, SNSR_SLM_START_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.SLM_START_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.SLM_START_EVENT, on_event) ``` SLM processing is about to start. Return [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop) from the callback handler to skip SLM processing. **Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result), [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial) ## Enrollment & adaptation ### ^adapt-started - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_ADAPT_STARTED_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.ADAPT_STARTED_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.ADAPT_STARTED_EVENT, on_event) ``` Wake word model adaptation thread has started. This handler is called from a thread started to do model adaptation in continuously adapting spotter models. The handler function should return [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok) to continue with model adaptation, or [SKIP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_skip) to abort adaptation and ignore this enrollment without raising an error. 
The handler function's [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) argument will be `NULL` as it is not safe to access this handle from the enrollment thread. You can use this handler to adjust the adaptation thread priority, or delay the start of adaptation by blocking on a condition variable. **Note:** [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) teardown waits on joining the adaptation thread. Any handler function registered for this event must return before the associated [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) is released. **Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted) ### ^adapted - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_ADAPTED_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.ADAPTED_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.ADAPTED_EVENT, on_event) ``` Wake word adaptation complete event. Adaptation is the estimation of spotter model parameters from a number of enrollments for one or more users. **Example:** Add a handler for this event to save adapted enrollment contexts. **C/C++** ```c snsrSave(session, SNSR_FM_RUNTIME, snsrStreamFromFileName("context.snsr", "w")); ``` **Java** ```java session.save(Snsr.FM_RUNTIME, SnsrStream.fromFileName("context.snsr", "w")); ``` **Python** ```python session.save( snsr.DataFormat.RUNTIME, snsr.Stream.from_filename("context.snsr", "w"), ) ``` To load a previously saved enrollment context, do so _after_ loading the model that created it. 
**Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count), [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user), [user-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#user-count) **Available iterators:** [user-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#user-iterator) **Also see these related items:** [^enrolled](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#enrolled), [save](https://doc.sensory.com/tnl/7.8/api/inference.md#save), [RUNTIME](https://doc.sensory.com/tnl/7.8/api/inference.md#fm_runtime) ### ^done - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_DONE_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.DONE_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.DONE_EVENT, on_event) ``` Wake word adaptation complete event. Handler called when adaptation has completed and the [model-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#model-stream) model is available. This handler typically returns [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop) to end processing. **Available results:** [model-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#model-stream), [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ ### ^enrolled - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_ENROLLED_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.ENROLLED_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.ENROLLED_EVENT, on_event) ``` Wake word enrollment complete event. Handler called when enrollment is complete, just before model adaptation starts. Use this handler for the earliest possible access to the enrollment context. 
Enrollment is the process of taking individual enrollments (audio files or live audio clips), validating them, and adding them to the model. These enrollments are used during the adaptation stage to estimate spotter model parameters. **Example:** Add a handler for this event to save enrollment contexts before they are adapted. **C/C++** ```c snsrSave(session, SNSR_FM_RUNTIME, snsrStreamFromFileName("enrollments.snsr", "w")); ``` **Java** ```java session.save(Snsr.FM_RUNTIME, SnsrStream.fromFileName("enrollments.snsr", "w")); ``` **Python** ```python session.save( snsr.DataFormat.RUNTIME, snsr.Stream.from_filename("enrollments.snsr", "w"), ) ``` **Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^done](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#done), [save](https://doc.sensory.com/tnl/7.8/api/inference.md#save), [RUNTIME](https://doc.sensory.com/tnl/7.8/api/inference.md#fm_runtime) ### ^fail - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_FAIL_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.FAIL_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.FAIL_EVENT, on_event) ``` Wake word enrollment failed event. Handler called if an enrollment fails quality checks. 
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [enrollment-id](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#enrollment-id), [reason](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason), [reason-guidance](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-guidance), [reason-pass](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-pass), [reason-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-threshold), [reason-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-value), [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** [reason-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#reason-iterator) **Also see these related items:** [^pass](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pass) ### ^new-user - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_NEW_USER_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.NEW_USER_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.NEW_USER_EVENT, on_event) ``` New user detected event. Handler called when a continuously adapting spotter model has adapted to a new user. 
**Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count), [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user), [user-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#user-count) **Available iterators:** [user-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#user-iterator) **Also see these related items:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [delete-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#delete-user), [rename-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#rename-user), [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) ### ^next - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_NEXT_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.NEXT_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.NEXT_EVENT, on_event) ``` Next user event. Handler called during interactive enrollment to prompt the end-user for a new enrollment phrase. This handler should set [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) to the tag associated with the new user. Leave [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) if there are no more users or phrases to enroll. **Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) ### ^pass - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_PASS_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.PASS_EVENT, (session, key) -> { /* ... 
*/ }); ``` **Python** ```python s.set_handler(snsr.PASS_EVENT, on_event) ``` Wake word enrollment succeeded event. Handler called when an enrollment passes quality checks. Return [SKIP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_skip) from this event handler to discard an enrollment. **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample), [enrollment-id](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#enrollment-id), [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail) ### ^pause - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_PAUSE_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.PAUSE_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.PAUSE_EVENT, on_event) ``` Input stream pause event. Handler called when a time-consuming processing step is about to start. Use this handler to pause the input stream when doing interactive enrollment. 
**Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [^resume](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#resume) ### ^progress - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_PROG_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.PROG_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.PROG_EVENT, on_event) ``` Wake word adaptation progress event. Handler called to report adaptation progress. **Available results:** [percent-done](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#percent-done), [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted) ### ^resume - event - handle - write-only **C/C++** ```c snsrSetHandler(s, SNSR_RESUME_EVENT, snsrCallback(on_event, NULL, NULL)); ``` **Java** ```java s.setHandler(Snsr.RESUME_EVENT, (session, key) -> { /* ... */ }); ``` **Python** ```python s.set_handler(snsr.RESUME_EVENT, on_event) ``` Input stream resume event. Handler called when a time-consuming processing step has completed. Use this handler to restart an input stream that was stopped during a [^pause](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pause) event handler. 
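One way to pair the `^pause` and `^resume` handlers is a shared flag that the audio capture loop checks before feeding samples. The sketch below is a stand-alone illustration and does not call the SDK: `AudioGate` and its methods are hypothetical names, the `(session, key)` handler arguments are assumed from the Java lambdas shown on this page, and the return value the SDK expects from a handler is not modeled.

```python
import threading


class AudioGate:
    """Hypothetical helper: tracks whether the capture loop should feed audio."""

    def __init__(self):
        self._open = threading.Event()
        self._open.set()  # audio flows until a pause is requested

    def on_pause(self, session=None, key=None):
        # Would be registered for the pause event; stops feeding audio.
        self._open.clear()

    def on_resume(self, session=None, key=None):
        # Would be registered for the resume event; feeding may continue.
        self._open.set()

    def is_open(self):
        return self._open.is_set()


gate = AudioGate()
states = [gate.is_open()]
gate.on_pause()
states.append(gate.is_open())
gate.on_resume()
states.append(gate.is_open())
```

A `threading.Event` is used so that a capture thread could also block on the flag with `wait()` instead of polling; whether that suits a given audio source is an application decision.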
**Available results:** [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) **Available iterators:** _none_ **Also see these related items:** [^pause](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pause) *[API]: Application Programming Interface *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[NLU]: Natural Language Understanding model *[SDK]: Software Development Kit *[SLM]: Generative Small Language Model *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "api/setting-keys/index.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/setting-keys/" --- # Setting keys Settings are strings used as keys in [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) object function or method calls. There are different categories of keys, defined by how they are used. [Configuration](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#configuration), [Runtime](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#runtime), [Results](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#results), [Events](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#events), and [Iterators](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#iterators) are organized by **concern** (for example, **Audio I/O** or **Wake word & command set**). Within each group, entries are listed alphabetically. A setting that applies to more than one concern is documented once under its primary group; other groups link to that entry. [Values](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#values) and [Library information](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#library-information) are short, flat lists and are not grouped this way. 
[Events](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#events) - Use [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) with event keys to register callback handlers for events. [Iterators](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#iterators) - Use with [forEach](https://doc.sensory.com/tnl/7.8/api/inference.md#foreach) to loop over lists of values. [Results](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#results) - Report the results of model [inference](https://doc.sensory.com/tnl/7.8/api/inference.md#inference). [Runtime](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#runtime) - Inspect or alter the runtime state of [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) instances. [Configuration](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#configuration) - Change or fine-tune model behavior. Values for these keys persist in model files. You can modify these settings with the `-s` option in the command-line utilities. [Values](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#values) - String constants that define [task-types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type) used with [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require), or that identify template [slots](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot). [Library information](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#library-information) - Information about the library, rather than any specific model. **Concern groups (Configuration, Runtime, Results, Events, Iterators):** The same group names appear on every grouped page; a page omits groups that have no entries. 
Groups used in this reference: - **Audio I/O** — streams and the audio history buffer - **VAD & endpointing** — voice activity detection and silence handling - **Wake word & command set** — operating point, listen window, thresholds - **LVCSR & STT** — decoder tuning, grammar streams, STT vocabulary, transcription profile - **Alignment & timing** — sample/millisecond offsets ([Results](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#results) only) - **NLU & SLM** — intents, entities, and SLM controls - **Templates & slots** — slot identifiers and template loop/include behavior - **Enrollment & adaptation** — enrollment flow and adaptation - **Push / pull execution** — push buffers, duration limits, flush behavior - **THF Micro DSP** — export streams, target port, and conversion results for [THF Micro][] embedded deployments - **Logging & diagnostics** — debug logging and profiling - **Identity & metadata** — task/model identity and licensing [THF Micro]: https://doc.sensory.com/thf-micro/ "THF Micro documentation" *[API]: Application Programming Interface *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[NLU]: Natural Language Understanding model *[SLM]: Generative Small Language Model *[STT]: Speech To Text: transformers with language model and CTC decoding *[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "api/setting-keys/iterators.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators/" --- # Iterators Use these keys with [forEach](https://doc.sensory.com/tnl/7.8/api/inference.md#foreach) to loop over lists of values. 
The values for these settings refer to runtime instances of code objects and are not serialized by [save](https://doc.sensory.com/tnl/7.8/api/inference.md#save) or [dup](https://doc.sensory.com/tnl/7.8/api/inference.md#dup). The [• results](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#results) and [• iterators](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#iterators) available in the loop functions vary by iterator. See the descriptions below for details on what these are. Iterators are available only in the [• events](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#events) listed, except for those marked as • _all_, which you can use outside of event callbacks too. Most applications don't use iterators directly — recognition results arrive via [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) without iteration. The two most useful when you do reach for them are [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) (per-word alignments and scores) and [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator) (model introspection); see the descriptions below for the rest. ### Details: Page conventions Settings are grouped by concern. Within each group they're listed alphabetically. A setting that serves more than one concern appears once under its primary group; secondary groups link to it. The code tabs on each setting show one paste-able call site per language. Adapt the placeholder names (`s` for the [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session), plus `value`, `stream`, `on_event`, or `on_item` for the operand) to your code. 
Each example assumes the language's standard prelude: **C/C++** ```c #include <snsr.h> ``` **Java** ```java import com.sensory.speech.snsr.Snsr; import com.sensory.speech.snsr.SnsrSession; ``` **Python** ```python import snsr ``` For the full Session lifecycle (`snsrNew`, `new SnsrSession()`, `snsr.Session(...)`) see [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program). ## Wake word & command set ### model-iterator - iterator - handle - write-only **C/C++** ```c snsrForEach(s, SNSR_MODEL_LIST, snsrCallback(on_item, NULL, NULL)); ``` **Java** ```java s.forEach(Snsr.MODEL_LIST, (session, key) -> { /* ... */ }); ``` **Python** ```python s.for_each(snsr.MODEL_LIST, on_item) ``` Iterate over the phoneme parts in a result. **Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) **Available results:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample), [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score), [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) **Available iterators:** _none_ **Also see these related items:** [phone-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phone-iterator), [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) ### operating-point-iterator - iterator - handle - write-only **C/C++** ```c snsrForEach(s, SNSR_OPERATING_POINT_LIST, snsrCallback(on_item, NULL, NULL)); ``` **Java** ```java s.forEach(Snsr.OPERATING_POINT_LIST, (session, key) -> { /* ... 
*/ }); ``` **Python** ```python s.for_each(snsr.OPERATING_POINT_LIST, on_item) ``` Iterate over available spotter operating points. **Available in these events:** _all_ **Available results:** [available-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#available-point) **Available iterators:** _none_ ### phone-iterator - iterator - handle - write-only **C/C++** ```c snsrForEach(s, SNSR_PHONE_LIST, snsrCallback(on_item, NULL, NULL)); ``` **Java** ```java s.forEach(Snsr.PHONE_LIST, (session, key) -> { /* ... */ }); ``` **Python** ```python s.for_each(snsr.PHONE_LIST, on_item) ``` Iterate over the phonemes in a result. STT models do not report any sub-word unit hypotheses. **Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) **Available results:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample), [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score), [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) **Available iterators:** _none_ **Also see these related items:** [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) ### vocab-iterator - iterator - handle - write-only **C/C++** ```c snsrForEach(s, SNSR_VOCAB_LIST, snsrCallback(on_item, NULL, NULL)); ``` **Java** ```java s.forEach(Snsr.VOCAB_LIST, (session, key) -> { /* ... */ }); ``` **Python** ```python s.for_each(snsr.VOCAB_LIST, on_item) ``` Iterate over the vocabulary available in a keyword spotter. 
**Available in these events:** _all_ **Available results:** [id](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#id), [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) **Available iterators:** _none_ ### word-iterator - iterator - handle - write-only **C/C++** ```c snsrForEach(s, SNSR_WORD_LIST, snsrCallback(on_item, NULL, NULL)); ``` **Java** ```java s.forEach(Snsr.WORD_LIST, (session, key) -> { /* ... */ }); ``` **Python** ```python s.for_each(snsr.WORD_LIST, on_item) ``` Iterate over the words in a result. User-defined enrolled results are modeled as a single word. **Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) **Available results:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample), [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score), [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) **Available iterators:** _none_ ## LVCSR & STT ### model-iterator - iterator - handle - write-only Documented under **Wake word & command set** ([model-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#model-iterator)). ### phone-iterator - iterator - handle - write-only Documented under **Wake word & command set** ([phone-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phone-iterator)). ### phrase-iterator - iterator - handle - write-only - TrulyNatural only **C/C++** ```c snsrForEach(s, SNSR_PHRASE_LIST, snsrCallback(on_item, NULL, NULL)); ``` **Java** ```java s.forEach(Snsr.PHRASE_LIST, (session, key) -> { /* ... 
*/ }); ``` **Python** ```python s.for_each(snsr.PHRASE_LIST, on_item) ``` Iterate over recognition phrase hypotheses. Most recognizers provide only the top-scoring recognition result. `phrase-iterator` is useful only when an LVCSR recognizer is configured to provide N-best results, that is, when [result-max](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#result-max) `> 1`. **Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) **Available results:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample), [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score), [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) **Available iterators:** _none_ **Also see these related items:** [result-max](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#result-max) ### word-iterator - iterator - handle - write-only Documented under **Wake word & command set** ([word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator)). ## NLU & SLM ### nlu-entity-iterator - iterator - handle - write-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c snsrForEach(s, SNSR_NLU_ENTITY_LIST, snsrCallback(on_item, NULL, NULL)); ``` **Java** ```java s.forEach(Snsr.NLU_ENTITY_LIST, (session, key) -> { /* ... */ }); ``` **Python** ```python s.for_each(snsr.NLU_ENTITY_LIST, on_item) ``` Iterate over NLU entities in the current intent. This iterator invokes the specified handler for each of the [nlu-entity-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-count) entities found for the current intent. 
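The entity names reported here are the flattened, period-separated slot paths described for `^nlu-intent`. The following is a minimal stand-alone sketch of that flattening. It uses a plain nested dictionary as a hypothetical stand-in for a parsed slot tree (the SDK does not expose results in this form), the slot values are made up, and it assumes that only leaf slots are reported as entities, as in the `date` example.

```python
def flatten_entities(slot_tree, prefix=""):
    """Flatten nested slots into (dotted_name, value) pairs for leaf slots."""
    entities = []
    for name, value in slot_tree.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            # Nested slot: descend and extend the dotted path.
            entities.extend(flatten_entities(value, path))
        else:
            entities.append((path, value))
    return entities


# Mirrors the grammar snippet { date { time {hours} {minutes} } }.
parsed = {"date": {"time": {"hours": "seven", "minutes": "thirty"}}}

# Each top-level slot is treated as an intent; its descendants become entities.
intents = {
    intent: flatten_entities(children)
    for intent, children in parsed.items()
}
```

With this input, the `date` intent ends up with the two entities `time.hours` and `time.minutes`, matching the names given in the `^nlu-intent` description.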
**Available in these events:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) **Available results:** [nlu-entity-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-count), [nlu-entity-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-name), [nlu-entity-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-score), [nlu-entity-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-value) **Available iterators:** _none_ **Also see these related items:** [nlu-entity-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-count) ### nlu-slot-iterator - iterator - handle - write-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c snsrForEach(s, SNSR_NLU_SLOT_LIST, snsrCallback(on_item, NULL, NULL)); ``` **Java** ```java s.forEach(Snsr.NLU_SLOT_LIST, (session, key) -> { /* ... */ }); ``` **Python** ```python s.for_each(snsr.NLU_SLOT_LIST, on_item) ``` Iterate over NLU child result slots. This iterator invokes the specified handler for each of the NLU result slots found in the current NLU slot value, [nlu-slot-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-value). 
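Because `nlu-slot-iterator` is itself available inside its own loop function, nested slots can be visited recursively. The sketch below models that pattern without calling the SDK: `for_each_child` is a hypothetical stand-in for iterating the children of the current slot, the slot tree is a hand-built dictionary, and the values are invented for illustration.

```python
def for_each_child(slot, on_item):
    """Hypothetical stand-in for iterating the child slots of the current slot."""
    for child in slot.get("children", []):
        on_item(child)


def walk(slot, depth=0, lines=None):
    """Record a slot, then recurse into its children from inside the callback."""
    if lines is None:
        lines = []
    lines.append(f"{'  ' * depth}{slot['name']}={slot['value']}")
    for_each_child(slot, lambda child: walk(child, depth + 1, lines))
    return lines


# Slot tree for the grammar snippet { date { time {hours} {minutes} } }.
top = {"name": "date", "value": "seven thirty", "children": [
    {"name": "time", "value": "seven thirty", "children": [
        {"name": "hours", "value": "seven"},
        {"name": "minutes", "value": "thirty"},
    ]},
]}
outline = walk(top)
```

The recursion depth follows the nesting of the `{}` operators in the grammar, so the `date` example produces three levels.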
**Available in these events:** [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) **Available results:** [nlu-slot-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-count), [nlu-slot-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-name), [nlu-slot-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-score), [nlu-slot-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-value) **Available iterators:** [nlu-slot-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-slot-iterator) ### nlu-word-iterator - iterator - handle - write-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c snsrForEach(s, SNSR_NLU_WORD_LIST, snsrCallback(on_item, NULL, NULL)); ``` **Java** ```java s.forEach(Snsr.NLU_WORD_LIST, (session, key) -> { /* ... */ }); ``` **Python** ```python s.for_each(snsr.NLU_WORD_LIST, on_item) ``` Iterate over the words in an NLU result. Valid only for LVCSR recognizers that include NLU post-processing. This post-processing can insert, delete, or change words in the recognition result and those changes are available here. 
**Available in these events:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) **Available results:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample), [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score), [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) **Available iterators:** _none_ **Also see these related items:** [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) ## Enrollment & adaptation ### enrollment-iterator - iterator - handle - write-only **C/C++** ```c snsrForEach(s, SNSR_ENROLLMENT_LIST, snsrCallback(on_item, NULL, NULL)); ``` **Java** ```java s.forEach(Snsr.ENROLLMENT_LIST, (session, key) -> { /* ... */ }); ``` **Python** ```python s.for_each(snsr.ENROLLMENT_LIST, on_item) ``` Iterate over all wake word enrollments for the current user. This can be used to retrieve enrollment audio if [save-enroll-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#save-enroll-audio) is enabled. 
**Available in these events:** _all_ **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample), [enrollment-id](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#enrollment-id), [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) **Available iterators:** _none_ **Also see these related items:** [save-enroll-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#save-enroll-audio), [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user), [user-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#user-iterator) ### reason-iterator - iterator - handle - write-only **C/C++** ```c snsrForEach(s, SNSR_REASON_LIST, snsrCallback(on_item, NULL, NULL)); ``` **Java** ```java s.forEach(Snsr.REASON_LIST, (session, key) -> { /* ... */ }); ``` **Python** ```python s.for_each(snsr.REASON_LIST, on_item) ``` Iterate over all reasons for wake word enrollment failure. 
**Available in these events:** [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail) **Available results:** [reason](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason), [reason-guidance](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-guidance), [reason-pass](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-pass), [reason-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-threshold), [reason-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-value) **Available iterators:** _none_ ### user-iterator - iterator - handle - write-only **C/C++** ```c snsrForEach(s, SNSR_USER_LIST, snsrCallback(on_item, NULL, NULL)); ``` **Java** ```java s.forEach(Snsr.USER_LIST, (session, key) -> { /* ... */ }); ``` **Python** ```python s.for_each(snsr.USER_LIST, on_item) ``` Iterate over all enrolled users. Sets [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) to each of the enrolled users before invoking the callback. **Available in these events:** _all_ **Available results:** [enrollment-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#enrollment-count), [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) **Available iterators:** _none_ *[API]: Application Programming Interface *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[NLU]: Natural Language Understanding model *[SLM]: Generative Small Language Model *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "api/setting-keys/library-information.md" canonical_url: "https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information/" --- # Library information These keys return information about the library. 
They require a valid [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) but are available before loading a task model. Read information keys with the [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) `get` function that matches the type of the setting. For example, use [getString](https://doc.sensory.com/tnl/7.8/api/inference.md#getters) to retrieve the _(string)_ value of [oss-components](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#oss-components). ### Details: Page conventions Settings are grouped by concern. Within each group they're listed alphabetically. A setting that serves more than one concern appears once under its primary group; secondary groups link to it. The code tabs on each setting show one paste-able call site per language. Adapt the placeholder names (`s` for the [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session), plus `value`, `stream`, `on_event`, or `on_item` for the operand) to your code. Each example assumes the language's standard prelude: **C/C++** ```c #include ``` **Java** ```java import com.sensory.speech.snsr.Snsr; import com.sensory.speech.snsr.SnsrSession; ``` **Python** ```python import snsr ``` For the full Session lifecycle (`snsrNew`, `new SnsrSession()`, `snsr.Session(...)`) see [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program). ## accel-info - information - string - read-only **C/C++** ```c const char * value; snsrGetString(s, SNSR_ACCELERATION, &value); ``` **Java** ```java String value = s.getString(Snsr.ACCELERATION); ``` **Python** ```python value = s.get_string(snsr.ACCELERATION) ``` Type of acceleration used. Returns text describing the type of vector acceleration the library uses. 
**Also see these related items:** [library-info](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#library-info) ## library-info - information - string - read-only **C/C++** ```c const char * value; snsrGetString(s, SNSR_LIBRARY_INFO, &value); ``` **Java** ```java String value = s.getString(Snsr.LIBRARY_INFO); ``` **Python** ```python value = s.get_string(snsr.LIBRARY_INFO) ``` SDK library information. Human-readable summary of [NAME](https://doc.sensory.com/tnl/7.8/api/constants.md#name), [VERSION](https://doc.sensory.com/tnl/7.8/api/constants.md#version), [VERSION_DSP](https://doc.sensory.com/tnl/7.8/api/constants.md#version_dsp), [accel-info](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#accel-info), [license-exp-message](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#license-exp-message), [license-exp-warn](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#license-exp-warn), [stt-support](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#stt-support), [thread-support](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#thread-support), [LICENSE_SUPPORT](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_license_support), and [oss-components](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#oss-components). ## license-exp-date - information - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_LICENSE_EXPDATE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.LICENSE_EXPDATE); ``` **Python** ```python value = s.get_double(snsr.LICENSE_EXPDATE) ``` Library license expiration date. Returns the license expiration date of the TrulyNatural library in seconds since the [epoch][]. For production keys, which do not expire, the expiration date is `0`. 
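The raw value is easier to work with once converted to a calendar date. The following is a minimal sketch, not part of the SDK: `license_expiry` and `days_remaining` are hypothetical helper names, and the only behavior taken from the description above is that the value is in seconds since the epoch and that `0` means the key does not expire.

```python
from datetime import datetime, timezone
from typing import Optional


def license_expiry(exp_seconds: float) -> Optional[datetime]:
    """Convert a license-exp-date value to a timezone-aware UTC datetime.

    Returns None for 0, which marks a key that does not expire.
    """
    if exp_seconds == 0:
        return None
    return datetime.fromtimestamp(exp_seconds, tz=timezone.utc)


def days_remaining(exp_seconds: float, now: datetime) -> Optional[float]:
    """Days between `now` (timezone-aware) and the expiration date.

    Returns None for non-expiring keys; negative if already expired.
    """
    expiry = license_expiry(exp_seconds)
    if expiry is None:
        return None
    return (expiry - now).total_seconds() / 86400.0
```

In an application the argument would typically be the result of `s.get_double(snsr.LICENSE_EXPDATE)`, with `datetime.now(timezone.utc)` passed as `now`.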
## license-exp-message - information - string - read-only **C/C++** ```c const char * value; snsrGetString(s, SNSR_LICENSE_EXPIRES, &value); ``` **Java** ```java String value = s.getString(Snsr.LICENSE_EXPIRES); ``` **Python** ```python value = s.get_string(snsr.LICENSE_EXPIRES) ``` Library license expiration message. The returned string is of the form `"Library license expires on "`, or `NULL` for license keys that do not expire. **Also see these related items:** [library-info](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#library-info) ## license-exp-warn - information - string - read-only **C/C++** ```c const char * value; snsrGetString(s, SNSR_LICENSE_WARNING, &value); ``` **Java** ```java String value = s.getString(Snsr.LICENSE_WARNING); ``` **Python** ```python value = s.get_string(snsr.LICENSE_WARNING) ``` Library license expiration warning message. This value is `NULL` for library license keys that do not expire, or if the expiration date is more than 60 days into the future. For license keys expiring in 60 days or fewer, the returned string will be of the form `"License will expire in 37 days."`. **Also see these related items:** [library-info](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#library-info) ## oss-components - information - string - read-only **C/C++** ```c const char * value; snsrGetString(s, SNSR_OSS_COMPONENTS, &value); ``` **Java** ```java String value = s.getString(Snsr.OSS_COMPONENTS); ``` **Python** ```python value = s.get_string(snsr.OSS_COMPONENTS) ``` Open Source library acknowledgements. List of Open Source third-party modules linked into the binary. The returned string lists one module per newline-separated line. Each line contains the name of the library, the license type and a URL to the module license text, separated by tab characters. The TrulyNatural SDK uses [permissive software licenses][oss-permissive] only. 
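The line layout described above (one module per line, three tab-separated fields) can be split into records with a few lines of code. This is an illustrative sketch only: `OssComponent` and `parse_oss_components` are hypothetical names, and the sample data in the usage note is invented. Lines that do not have exactly three fields are skipped rather than guessed at, and an empty or missing string yields an empty list.

```python
from typing import List, NamedTuple, Optional


class OssComponent(NamedTuple):
    name: str
    license_type: str
    url: str


def parse_oss_components(value: Optional[str]) -> List[OssComponent]:
    """Split an oss-components string into (name, license type, URL) records."""
    if not value:
        return []
    components = []
    for line in value.splitlines():
        if not line.strip():
            continue  # ignore blank lines, e.g. a trailing newline
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # skip lines that do not match the documented layout
        components.append(OssComponent(*(f.strip() for f in fields)))
    return components
```

For example, `parse_oss_components("examplelib\tMIT\thttps://example.com/LICENSE")` returns a single record whose `name` is `"examplelib"`; real input would come from `s.get_string(snsr.OSS_COMPONENTS)`.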
If the binary uses no Open Source modules, the returned string is `NULL`.

**Also see these related items:** [library-info](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#library-info), [Open Source Licenses](https://doc.sensory.com/tnl/7.8/licenses/oss.md#open-source-licenses)

## stt-support - information - int - read-only

**C/C++**

```c
int value;
snsrGetInt(s, SNSR_STT_SUPPORT, &value);
```

**Java**

```java
int value = s.getInt(Snsr.STT_SUPPORT);
```

**Python**

```python
value = s.get_int(snsr.STT_SUPPORT)
```

Library Speech-To-Text support. Returns `1` if the TrulyNatural SDK supports STT models, or `0` if STT support is not available. This is functionally equivalent to calling [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config) with [CONFIG_STT_SUPPORT](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_stt_support), but is available across all API language bindings.

**Also see these related items:** [library-info](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#library-info), [CONFIG_STT_SUPPORT](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_stt_support)

## thread-support - information - int - read-only

**C/C++**

```c
int value;
snsrGetInt(s, SNSR_THREAD_SUPPORT, &value);
```

**Java**

```java
int value = s.getInt(Snsr.THREAD_SUPPORT);
```

**Python**

```python
value = s.get_int(snsr.THREAD_SUPPORT)
```

Library multithreading support. Returns `1` if the TrulyNatural SDK supports running multi-threaded models, or `0` if thread support is not available. This is functionally equivalent to calling [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config) with [CONFIG_THREAD_SUPPORT](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_thread_support), but is available across all API language bindings.
**Also see these related items:** [library-info](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#library-info), [CONFIG_THREAD_SUPPORT](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_thread_support)

[epoch]: https://en.wikipedia.org/wiki/Unix_time "Unix time"
[oss-permissive]: https://en.wikipedia.org/wiki/Permissive_software_license "Grants use rights, forbids almost nothing"

*[API]: Application Programming Interface
*[OSS]: Open-source software
*[SDK]: Software Development Kit
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "api/setting-keys/results.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/setting-keys/results/"
---

# Results

These are read-only settings that report the results of model [inference](https://doc.sensory.com/tnl/7.8/api/inference.md#inference). The exact meaning depends on the context in which they are read. [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), for example, refers to the start time of a word in [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator), but to the onset of speech detection in [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end).

Most results are valid only for the duration of the [• event](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#events) or [• iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#iterators) [callback](https://doc.sensory.com/tnl/7.8/api/inference.md#callback) handler in which they are available. If a result is shown as being available in • _all_ events, you can also read it outside of event callbacks.

Read results with the [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) `get` function that matches the type of the setting.
For example, use [getDouble](https://doc.sensory.com/tnl/7.8/api/inference.md#getters) to retrieve _(double)_ values.

The handful of results most applications read:

| Name | Type | Available in | Summary |
|------|------|--------------|---------|
| [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) | string | `^result`, `^result-partial`, `^slm-result*`; word and phrase iterators | Phrase, word, or phoneme hypothesis. |
| [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score) | double | `^result`, `^result-partial`; word and phrase iterators | How well audio matches the recognizer; threshold for spotters. |
| [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms) / [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms) | double | `^result`, `^result-partial`, `^begin`, `^end`, `^limit`; alignment iterators | Audio start / endpoint timestamps. |
| [nlu-intent-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-name) | string | `^nlu-intent` | Top-level NLU intent. |
| [nlu-intent-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-value) | string | `^nlu-intent` | Captured value for the current NLU intent. |

Everything else on this page is either context-specific or rarely needed.

### Details: Page conventions

Settings are grouped by concern. Within each group they're listed alphabetically. A setting that serves more than one concern appears once under its primary group; secondary groups link to it.

The code tabs on each setting show one paste-able call site per language. Adapt the placeholder names (`s` for the [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session), plus `value`, `stream`, `on_event`, or `on_item` for the operand) to your code.
Each example assumes the language's standard prelude:

**C/C++**

```c
#include
```

**Java**

```java
import com.sensory.speech.snsr.Snsr;
import com.sensory.speech.snsr.SnsrSession;
```

**Python**

```python
import snsr
```

For the full Session lifecycle (`snsrNew`, `new SnsrSession()`, `snsr.Session(...)`) see [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program).

## Audio I/O

### audio-stream - result - output stream - read-only

**C/C++**

```c
SnsrStream stream;
snsrGetStream(s, SNSR_AUDIO_STREAM, &stream);
```

**Java**

```java
SnsrStream stream = s.getStream(Snsr.AUDIO_STREAM);
```

**Python**

```python
stream = s.get_stream(snsr.AUDIO_STREAM)
```

Segmented audio data stream.

* For enrollment tasks with [save-enroll-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#save-enroll-audio) set to `1` (on), this is the enrollment recording. If [save-enroll-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#save-enroll-audio) is `0` (off), audio is available only in the [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail) event.
* For recognition tasks, these are the samples from [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from) to [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), selected from the last [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size) samples processed by the recognizer. The default [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size) is `0`, which disables audio buffering and causes `audio-stream` retrieval to fail.
Be sure to set [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size) to the expected number of samples before calling [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) or [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run). **Available in these events:** _all_ **Available in these iterators:** _all_ **Also see these related items:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) ### audio-stream-first - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_AUDIO_STREAM_FIRST, &value); ``` **Java** ```java double value = s.getDouble(Snsr.AUDIO_STREAM_FIRST); ``` **Python** ```python value = s.get_double(snsr.AUDIO_STREAM_FIRST) ``` Audio buffer start sample index. The index of the first (oldest) audio sample contained in the [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream). 
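Together with [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last), this index bounds the buffered audio. The sketch below estimates how much audio the buffer holds. It rests on two assumptions that the text does not state: that both indices are inclusive, and that the audio is sampled at 16 kHz. The function names are hypothetical and not part of the SDK.

```python
def buffered_sample_count(first: float, last: float) -> int:
    """Number of samples between two inclusive sample indices.

    Assumption: both audio-stream-first and audio-stream-last are inclusive.
    Returns 0 if the range is empty or inverted.
    """
    if last < first:
        return 0
    return int(last) - int(first) + 1


def buffered_duration_ms(first: float, last: float,
                         sample_rate_hz: int = 16000) -> float:
    """Duration of the buffered audio in milliseconds.

    Assumption: 16 kHz default sample rate; pass the actual rate if it differs.
    """
    return 1000.0 * buffered_sample_count(first, last) / sample_rate_hz
```

The arguments would typically come from `s.get_double(snsr.AUDIO_STREAM_FIRST)` and `s.get_double(snsr.AUDIO_STREAM_LAST)`.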
**Available in these events:** _all_ **Available in these iterators:** _all_ **Also see these related items:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) ### audio-stream-last - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_AUDIO_STREAM_LAST, &value); ``` **Java** ```java double value = s.getDouble(Snsr.AUDIO_STREAM_LAST); ``` **Python** ```python value = s.get_double(snsr.AUDIO_STREAM_LAST) ``` The index of the last (most recent) audio sample contained in the [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream). **Available in these events:** _all_ **Available in these iterators:** _all_ **Also see these related items:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first) ### frame-count - configuration - double - read-only - pre-release **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_FRAMES, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_FRAMES); ``` **Python** ```python value = s.get_double(snsr.RES_FRAMES) ``` Number of feature frames read from the input stream. **Pre-release:** This is an experimental feature. Do not use unless recommended by Sensory. **Available in these events:** _all_ **Available in these iterators:** _all_ ### sample-count - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_SAMPLES, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_SAMPLES); ``` **Python** ```python value = s.get_double(snsr.RES_SAMPLES) ``` Number of samples read from the input stream. 
**Available in these events:** _all_

**Available in these iterators:** _all_

**Also see these related items:** [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event)

## Wake word & command set

### available-point - result - int - read-only

**C/C++**

```c
int value;
snsrGetInt(s, SNSR_RES_AVAILABLE_POINT, &value);
```

**Java**

```java
int value = s.getInt(Snsr.RES_AVAILABLE_POINT);
```

**Python**

```python
value = s.get_int(snsr.RES_AVAILABLE_POINT)
```

Available operating point.

**Available in these events:** _none_

**Available in these iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator)

**Also see these related items:** [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point)

### confidence-score - result - double - read-only - deprecated [6.14.0](https://doc.sensory.com/tnl/7.8/changes/version-6.md#v6.14.0)

**C/C++**

```c
double value;
snsrGetDouble(s, SNSR_RES_CONF_SCORE, &value);
```

**Java**

```java
double value = s.getDouble(Snsr.RES_CONF_SCORE);
```

**Python**

```python
value = s.get_double(snsr.RES_CONF_SCORE)
```

Fixed-phrase wake word confidence score.

**Deprecated:** Confidence score support will be removed in the next major release of this SDK. **Do not use this in new code.**

The probability of the spotted phrase being a true accept. **This is a model-dependent optional feature that is not universally supported.** It is not supported by enrolled models; use [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) instead. The reported range is `0` to `1`, or `< 0` if not supported by the spotter model.
**Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) **Available in these iterators:** _none_ **Also see these related items:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score), [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) ### id - result - int - read-only **C/C++** ```c int value; snsrGetInt(s, SNSR_RES_ID, &value); ``` **Java** ```java int value = s.getInt(Snsr.RES_ID); ``` **Python** ```python value = s.get_int(snsr.RES_ID) ``` Recognition ID result. Unique wake word phrase result ID, compatible with [THF Micro][]. For most single-phrase spotters this will be `1`. **Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) **Available in these iterators:** [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator) **Also see these related items:** [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) ### noise-energy - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_NOISE_ENERGY, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_NOISE_ENERGY); ``` **Python** ```python value = s.get_double(snsr.RES_NOISE_ENERGY) ``` Noise energy. The energy (in dB relative to `1.0`) in the background audio preceding the wake word audio. 
**Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result)

**Available in these iterators:** _none_

**Also see these related items:** [signal-energy](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#signal-energy), [snr](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#snr)

### score - result - double - read-only

**C/C++**

```c
double value;
snsrGetDouble(s, SNSR_RES_SCORE, &value);
```

**Java**

```java
double value = s.getDouble(Snsr.RES_SCORE);
```

**Python**

```python
value = s.get_double(snsr.RES_SCORE)
```

Recognition score. A value between `0` and `1` that indicates how well the acoustic evidence matches the recognizer's expectations. In phrase spotters that report this score, the operating point is set by thresholding this value.

**Note:** `score` is not supported by all recognizer types. For older models, [getDouble](https://doc.sensory.com/tnl/7.8/api/inference.md#getters) will report a [SETTING_NOT_FOUND](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) error.
Recent models that do not support scoring report [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok) and a score value of `-1.0`.

**Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial)

**Available in these iterators:** [phone-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phone-iterator), [phrase-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phrase-iterator), [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator)

### signal-energy - result - double - read-only

**C/C++**

```c
double value;
snsrGetDouble(s, SNSR_RES_SIGNAL_ENERGY, &value);
```

**Java**

```java
double value = s.getDouble(Snsr.RES_SIGNAL_ENERGY);
```

**Python**

```python
value = s.get_double(snsr.RES_SIGNAL_ENERGY)
```

Signal energy. The energy (in dB relative to `1.0`) in the spotted phrase.

**Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result)

**Available in these iterators:** _none_

**Also see these related items:** [noise-energy](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#noise-energy), [snr](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#snr)

### snr - result - double - read-only

**C/C++**

```c
double value;
snsrGetDouble(s, SNSR_RES_SNR, &value);
```

**Java**

```java
double value = s.getDouble(Snsr.RES_SNR);
```

**Python**

```python
value = s.get_double(snsr.RES_SNR)
```

Signal to noise ratio. The ratio of the wake word signal energy to the noise energy, in dB.
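Because both energies are expressed in dB, a ratio of the two corresponds to a difference of the dB values. The sketch below illustrates that arithmetic. The helper names are hypothetical, and whether the value the SDK reports for `snr` equals this difference exactly is an assumption, not something the text above confirms.

```python
def snr_db(signal_energy_db: float, noise_energy_db: float) -> float:
    """Signal-to-noise ratio in dB from two energies given in dB.

    A ratio in the linear domain becomes a subtraction in the dB domain.
    """
    return signal_energy_db - noise_energy_db


def db_to_linear_ratio(db: float) -> float:
    """Convert an energy ratio in dB to a linear ratio (10 ** (dB / 10))."""
    return 10.0 ** (db / 10.0)
```

For instance, a signal energy of `-20` dB and a noise energy of `-50` dB give `snr_db(-20.0, -50.0)`, a 30 dB ratio, which `db_to_linear_ratio` converts to a linear factor of about one thousand.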
**Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) **Available in these iterators:** _none_ **Also see these related items:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [noise-energy](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#noise-energy), [signal-energy](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#signal-energy) ### sv-score - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_SV_SCORE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_SV_SCORE); ``` **Python** ```python value = s.get_double(snsr.RES_SV_SCORE) ``` Speaker verification score. The confidence that the spotted phrase was spoken by the enrolled speaker, in the range `0` to `1`. For non-enrolled spotters the confidence is always `1`. **Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) **Available in these iterators:** _none_ **Also see these related items:** [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold) ### text - result - string - read-only **C/C++** ```c const char * value; snsrGetString(s, SNSR_RES_TEXT, &value); ``` **Java** ```java String value = s.getString(Snsr.RES_TEXT); ``` **Python** ```python value = s.get_string(snsr.RES_TEXT) ``` Recognition text result. The phrase, word, or phoneme hypothesis from a wake word, LVCSR, or STT recognizer. 
**Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result), [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial) **Available in these iterators:** [model-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#model-iterator), [nlu-word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-word-iterator), [phone-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phone-iterator), [phrase-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phrase-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator), [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) ## LVCSR & STT ### result-count - result - int - read-only - TrulyNatural only **C/C++** ```c int value; snsrGetInt(s, SNSR_RES_COUNT, &value); ``` **Java** ```java int value = s.getInt(Snsr.RES_COUNT); ``` **Python** ```python value = s.get_int(snsr.RES_COUNT) ``` Recognition result count. The total number of items available in the current list iteration. **Also see these related items:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [phrase-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phrase-iterator), [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) ### result-index - result - int - read-only - TrulyNatural only **C/C++** ```c int value; snsrGetInt(s, SNSR_RES_INDEX, &value); ``` **Java** ```java int value = s.getInt(Snsr.RES_INDEX); ``` **Python** ```python value = s.get_int(snsr.RES_INDEX) ``` Recognition result index. 
The index of the item under consideration in the current list iteration. **Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) **Available in these iterators:** [phrase-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phrase-iterator), [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) ### score - result - double - read-only Documented under **Wake word & command set** ([score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score)). ### text - result - string - read-only Documented under **Wake word & command set** ([text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text)). ## Alignment & timing ### begin-ms - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_BEGIN_MS, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_BEGIN_MS); ``` **Python** ```python value = s.get_double(snsr.RES_BEGIN_MS) ``` Timestamp of the audio start point. The offset in ms from the beginning of the audio stream where: * the recognition unit started in [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) or [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), or * the VAD first detected speech in [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), or [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit). 
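A millisecond offset and the matching sample offset ([begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample)) describe the same point in the audio stream and differ only by the sample rate. The conversion below is a sketch under the assumption of 16 kHz audio, which the text does not state; the function names are illustrative and not part of the SDK.

```python
def ms_to_sample(offset_ms: float, sample_rate_hz: int = 16000) -> int:
    """Convert a millisecond offset to the nearest sample index.

    Assumption: 16 kHz default sample rate; pass the actual rate if it differs.
    """
    return round(offset_ms * sample_rate_hz / 1000.0)


def sample_to_ms(sample_index: float, sample_rate_hz: int = 16000) -> float:
    """Convert a sample index to a millisecond offset."""
    return 1000.0 * sample_index / sample_rate_hz
```

Prefer reading the sample-based key directly when it is available; a conversion like this is mainly useful for cross-checking or for aligning results with audio handled outside the SDK.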
**Available in these events:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) **Available in these iterators:** [model-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#model-iterator), [nlu-word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-word-iterator), [phone-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phone-iterator), [phrase-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phrase-iterator), [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) **Also see these related items:** [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample) ### begin-sample - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_BEGIN_SAMPLE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_BEGIN_SAMPLE); ``` **Python** ```python value = s.get_double(snsr.RES_BEGIN_SAMPLE) ``` Sample index of the audio start point. 
The offset in samples from the beginning of the audio stream where: * the recognition unit started in [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) or [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), or * the VAD first detected speech in [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), or [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit). **Available in these events:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) **Available in these iterators:** [model-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#model-iterator), [nlu-word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-word-iterator), [phone-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phone-iterator), [phrase-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phrase-iterator), [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) **Also see these related items:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample) ### end-ms - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_END_MS, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_END_MS); ``` **Python** ```python value = 
s.get_double(snsr.RES_END_MS) ``` Timestamp of the audio endpoint. The offset in ms from the beginning of the audio stream: * where the recognition unit ended in [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), or * the VAD last detected speech in [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end) or [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit). **Available in these events:** [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) **Available in these iterators:** [model-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#model-iterator), [nlu-word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-word-iterator), [phone-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phone-iterator), [phrase-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phrase-iterator), [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) **Also see these related items:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [end-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-sample) ### end-sample - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_END_SAMPLE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_END_SAMPLE); ``` **Python** ```python value = s.get_double(snsr.RES_END_SAMPLE) ``` Sample index of the audio endpoint. 
The offset in samples from the beginning of the audio stream: * where the recognition unit ended in [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), or * the VAD last detected speech in [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end) or [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit). **Available in these events:** [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) **Available in these iterators:** [model-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#model-iterator), [nlu-word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-word-iterator), [phone-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phone-iterator), [phrase-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#phrase-iterator), [word-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#word-iterator) **Also see these related items:** [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms), [begin-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-sample), [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms) ## NLU & SLM ### domain - result - string - read-only _(STT only)_ **C/C++** ```c const char * value; snsrGetString(s, SNSR_RES_DOMAIN, &value); ``` **Java** ```java String value = s.getString(Snsr.RES_DOMAIN); ``` **Python** ```python value = s.get_string(snsr.RES_DOMAIN) ``` STT recognition domain. This is a short label for the domain identified by the STT recognizer, for example: `automotive` or `numbers`. 
This value is `NULL` if the recognizer does not support multiple domains. **Available in these events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) **Available in these iterators:** _none_ **Also see these related items:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) ### nlu-entity-count - result - int - read-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c int value; snsrGetInt(s, SNSR_RES_NLU_ENTITY_COUNT, &value); ``` **Java** ```java int value = s.getInt(Snsr.RES_NLU_ENTITY_COUNT); ``` **Python** ```python value = s.get_int(snsr.RES_NLU_ENTITY_COUNT) ``` Number of NLU entities available. Reports the number of entities the current [nlu-intent-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-value) contains. An entity is typically an object that an intent action operates on. **Available in these events:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) **Available in these iterators:** [nlu-entity-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-entity-iterator) **Also see these related items:** [nlu-entity-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-name), [nlu-entity-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-score), [nlu-entity-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-value), [nlu-intent-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-value) ### nlu-entity-name - result - string - read-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c const char * value; snsrGetString(s, SNSR_RES_NLU_ENTITY_NAME, &value); ``` **Java** ```java String value = s.getString(Snsr.RES_NLU_ENTITY_NAME); ``` **Python** ```python value = s.get_string(snsr.RES_NLU_ENTITY_NAME) ``` Name of the 
current entity. Valid only in [nlu-entity-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-entity-iterator) callback handlers. **Available in these events:** _none_ **Available in these iterators:** [nlu-entity-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-entity-iterator) **Also see these related items:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [nlu-entity-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-score), [nlu-entity-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-value) ### nlu-entity-score. - result - double - read-only - STT only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_NLU_ENTITY_SCORE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_NLU_ENTITY_SCORE); ``` **Python** ```python value = s.get_double(snsr.RES_NLU_ENTITY_SCORE) ``` Score of the current entity. Reports the confidence the model has that this entity was classified correctly. Not all NLU models report scores. If the score is not available it is reported as `0`. 
If you know the name of the entity, you can retrieve the value directly without having to use [nlu-entity-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-entity-iterator) by appending the name to `nlu-entity-score.` **Example:** **C/C++** ```c double score; snsrGetDouble(session, SNSR_RES_NLU_ENTITY_SCORE "alarm.time", &score); ``` **Java** ```java double score = session.getDouble(Snsr.RES_NLU_ENTITY_SCORE + "alarm.time"); ``` **Python** ```python score = session.get_double(snsr.RES_NLU_ENTITY_SCORE + "alarm.time") ``` **Available in these events:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) **Available in these iterators:** [nlu-entity-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-entity-iterator) **Also see these related items:** [nlu-entity-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-name), [nlu-entity-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-value) ### nlu-entity-value. - result - string - read-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c const char * value; snsrGetString(s, SNSR_RES_NLU_ENTITY_VALUE, &value); ``` **Java** ```java String value = s.getString(Snsr.RES_NLU_ENTITY_VALUE); ``` **Python** ```python value = s.get_string(snsr.RES_NLU_ENTITY_VALUE) ``` Captured value of the current entity. 
If you know the name of the entity, you can retrieve the value directly without having to use [nlu-entity-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-entity-iterator) by appending the name to `nlu-entity-value.` **Example:** **C/C++** ```c const char *atime; snsrGetString(session, SNSR_RES_NLU_ENTITY_VALUE "alarm.time", &atime); ``` **Java** ```java String atime = session.getString(Snsr.RES_NLU_ENTITY_VALUE + "alarm.time"); ``` **Python** ```python atime = session.get_string(snsr.RES_NLU_ENTITY_VALUE + "alarm.time") ``` **Available in these events:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) **Available in these iterators:** [nlu-entity-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-entity-iterator) **Also see these related items:** [nlu-entity-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-name), [nlu-entity-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-score) ### nlu-intent-name - result - string - read-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c const char * value; snsrGetString(s, SNSR_RES_NLU_INTENT_NAME, &value); ``` **Java** ```java String value = s.getString(Snsr.RES_NLU_INTENT_NAME); ``` **Python** ```python value = s.get_string(snsr.RES_NLU_INTENT_NAME) ``` The name of the current NLU intent. 
**Available in these events:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) **Available in these iterators:** _none_ **Also see these related items:** [nlu-intent-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-value), [nlu-intent-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-score) ### nlu-intent-score - result - double - read-only - STT only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_NLU_INTENT_SCORE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_NLU_INTENT_SCORE); ``` **Python** ```python value = s.get_double(snsr.RES_NLU_INTENT_SCORE) ``` Score of the current NLU intent. Reports the confidence the model has that the intent was classified correctly. Not all NLU models report scores. If the score is not available it is reported as `0`. **Available in these events:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) **Available in these iterators:** _none_ **Also see these related items:** [nlu-intent-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-name), [nlu-intent-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-value) ### nlu-intent-value - result - string - read-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c const char * value; snsrGetString(s, SNSR_RES_NLU_INTENT_VALUE, &value); ``` **Java** ```java String value = s.getString(Snsr.RES_NLU_INTENT_VALUE); ``` **Python** ```python value = s.get_string(snsr.RES_NLU_INTENT_VALUE) ``` Captured value of the current NLU intent. This is the part of the recognition result classified as the current intent. 
**Available in these events:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) **Available in these iterators:** _none_ **Also see these related items:** [nlu-intent-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-name), [nlu-intent-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-score) ### nlu-match-count - result - int - read-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c int value; snsrGetInt(s, SNSR_RES_NLU_COUNT, &value); ``` **Java** ```java int value = s.getInt(Snsr.RES_NLU_COUNT); ``` **Python** ```python value = s.get_int(snsr.RES_NLU_COUNT) ``` Number of NLU result matches available. Reports the number of NLU matches that are available for this result. The available matches are capped by [nlu-match-max](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#nlu-match-max). Multiple matches are only possible when there's ambiguity in the NLU grammar: One input sequence matches multiple output sequences, or when the `.*` match-any-word operator results in multiple valid segmentations. **Example:** ``` g = ~~{first .*} {second .*}~~ ; ``` **Available in these events:** [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) **Available in these iterators:** _none_ **Also see these related items:** [nlu-match-index](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-match-index), [nlu-match-max](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#nlu-match-max) ### nlu-match-index - result - int - read-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c int value; snsrGetInt(s, SNSR_RES_NLU_INDEX, &value); ``` **Java** ```java int value = s.getInt(Snsr.RES_NLU_INDEX); ``` **Python** ```python value = s.get_int(snsr.RES_NLU_INDEX) ``` The current NLU match index. Reports the current NLU match for [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot). 
The best-scoring match will have `nlu-match-index == 0`. **Available in these events:** [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) **Available in these iterators:** _none_ **Also see these related items:** [nlu-match-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-match-count), [nlu-match-max](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#nlu-match-max) ### nlu-slot-count - result - int - read-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c int value; snsrGetInt(s, SNSR_RES_NLU_SLOT_COUNT, &value); ``` **Java** ```java int value = s.getInt(Snsr.RES_NLU_SLOT_COUNT); ``` **Python** ```python value = s.get_int(snsr.RES_NLU_SLOT_COUNT) ``` Number of NLU child slots available. Reports the number of child slots the current [nlu-slot-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-value) contains. For final value slots this count is `0`. **Available in these events:** [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) **Available in these iterators:** [nlu-slot-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-slot-iterator) **Also see these related items:** [nlu-slot-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-name), [nlu-slot-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-score), [nlu-slot-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-value) ### nlu-slot-name - result - string - read-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c const char * value; snsrGetString(s, SNSR_RES_NLU_SLOT_NAME, &value); ``` **Java** ```java String value = s.getString(Snsr.RES_NLU_SLOT_NAME); ``` **Python** ```python value = s.get_string(snsr.RES_NLU_SLOT_NAME) ``` Name of the current NLU slot. 
**Available in these events:** [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) **Available in these iterators:** [nlu-slot-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-slot-iterator) **Also see these related items:** [nlu-slot-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-score), [nlu-slot-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-value) ### nlu-slot-score. - result - double - read-only - STT only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_NLU_SLOT_SCORE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_NLU_SLOT_SCORE); ``` **Python** ```python value = s.get_double(snsr.RES_NLU_SLOT_SCORE) ``` Score of the current NLU slot. Reports the confidence the model has that this slot was classified correctly. Not all NLU models report scores. If the score is not available it is reported as `0`. If you know the name of the (possibly nested) slot, you can retrieve the value directly without having to use [nlu-slot-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-slot-iterator). Separate slot names in the hierarchy with a period. 
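The period-separated naming rule can be sketched as a small key-building helper. This is an illustration under stated assumptions, not part of the SDK: `nested_slot_key` is a hypothetical name, the prefix string is assumed to match the `nlu-slot-score.` heading above, and rejecting slot names that contain a period is a design choice made here to keep the path unambiguous.

```python
# Assumed to equal the string behind snsr.RES_NLU_SLOT_SCORE (taken from the heading).
NLU_SLOT_SCORE_PREFIX = "nlu-slot-score."


def nested_slot_key(prefix: str, *slot_names: str) -> str:
    """Build a direct-access key such as 'nlu-slot-score.alarm.time'.

    slot_names lists the slot hierarchy from outermost to innermost.
    """
    if not slot_names:
        raise ValueError("at least one slot name is required")
    for name in slot_names:
        # A period inside a name would be indistinguishable from a separator.
        if not name or "." in name:
            raise ValueError(f"invalid slot name: {name!r}")
    return prefix + ".".join(slot_names)
```

For example, `nested_slot_key(NLU_SLOT_SCORE_PREFIX, "alarm", "time")` produces the same key string that the concatenation `snsr.RES_NLU_SLOT_SCORE + "alarm.time"` would, provided the prefix assumption holds.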
**Example:** **C/C++** ```c double score; snsrGetDouble(session, SNSR_RES_NLU_SLOT_SCORE "alarm.time", &score); ``` **Java** ```java double score = session.getDouble(Snsr.RES_NLU_SLOT_SCORE + "alarm.time"); ``` **Python** ```python score = session.get_double(snsr.RES_NLU_SLOT_SCORE + "alarm.time") ``` **Available in these events:** [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) **Available in these iterators:** [nlu-slot-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-slot-iterator) **Also see these related items:** [nlu-slot-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-name), [nlu-slot-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-value) ### nlu-slot-value. - result - string - read-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c const char * value; snsrGetString(s, SNSR_RES_NLU_SLOT_VALUE, &value); ``` **Java** ```java String value = s.getString(Snsr.RES_NLU_SLOT_VALUE); ``` **Python** ```python value = s.get_string(snsr.RES_NLU_SLOT_VALUE) ``` Captured value of the current NLU slot. Use `nlu-slot-value` to retrieve the string value of the current NLU slot. If you know the name of the (possibly nested) slot, you can retrieve the value directly without having to use [nlu-slot-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-slot-iterator). Separate slot names in the hierarchy with a period. 
**Example:** With this grammar: ``` ampm = am | pm; time = 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12; alarm = set the alarm for {time} {ampm}; grammar = ~~{alarm}~~ ; ``` Use this code snippet to retrieve the time and am/pm slot values in the [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) callback: **C/C++** ```c const char *timeHours; const char *ampm; snsrGetString(session, SNSR_RES_NLU_SLOT_VALUE "alarm.time", &timeHours); snsrGetString(session, SNSR_RES_NLU_SLOT_VALUE "alarm.ampm", &ampm); ``` **Java** ```java String timeHours = session.getString(Snsr.RES_NLU_SLOT_VALUE + "alarm.time"); String ampm = session.getString(Snsr.RES_NLU_SLOT_VALUE + "alarm.ampm"); ``` **Python** ```python time_hours = session.get_string(snsr.RES_NLU_SLOT_VALUE + "alarm.time") ampm = session.get_string(snsr.RES_NLU_SLOT_VALUE + "alarm.ampm") ``` **Available in these events:** [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) **Available in these iterators:** [nlu-slot-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#nlu-slot-iterator) **Also see these related items:** [nlu-slot-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-count), [nlu-slot-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-name) ### text - result - string - read-only Documented under **Wake word & command set** ([text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text)). SLM results are reported via this key; NLU intents use [nlu-intent-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-name) and [nlu-intent-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-value) instead. 
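Returning to the alarm grammar in the `nlu-slot-value.` example above: once the `alarm.time` and `alarm.ampm` slot strings have been retrieved, an application will usually want a normalized hour. The sketch below shows one way to do that. It assumes the slot values arrive as the literal grammar words (`"1"` to `"12"`, and `"am"` or `"pm"`); `alarm_to_24h` is a hypothetical helper, not an SDK function.

```python
def alarm_to_24h(time_hours: str, ampm: str) -> int:
    """Convert the 'alarm.time' and 'alarm.ampm' slot strings to an hour in 0..23.

    Assumes the values are exactly the words accepted by the example grammar.
    """
    hour = int(time_hours)
    if not 1 <= hour <= 12:
        raise ValueError(f"hour out of range: {time_hours!r}")
    period = ampm.strip().lower()
    if period not in ("am", "pm"):
        raise ValueError(f"expected 'am' or 'pm', got {ampm!r}")
    hour %= 12  # 12 am maps to 0; 12 pm maps to 12 after the offset below
    return hour + 12 if period == "pm" else hour
```

Validating both strings before use keeps an unexpected slot value from silently producing a wrong alarm time.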
## Enrollment & adaptation ### enrollment-count - result - int - read-only **C/C++** ```c int value; snsrGetInt(s, SNSR_RES_ENROLLMENT_COUNT, &value); ``` **Java** ```java int value = s.getInt(Snsr.RES_ENROLLMENT_COUNT); ``` **Python** ```python value = s.get_int(snsr.RES_ENROLLMENT_COUNT) ``` Enrollment count. The number of enrollments accumulated for the enrolled user. **Available in these events:** _none_ **Available in these iterators:** [enrollment-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#enrollment-iterator), [user-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#user-iterator) **Also see these related items:** [req-enroll](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#req-enroll), [user-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#user-iterator) ### enrollment-id - result - int - read-only **C/C++** ```c int value; snsrGetInt(s, SNSR_RES_ENROLLMENT_ID, &value); ``` **Java** ```java int value = s.getInt(Snsr.RES_ENROLLMENT_ID); ``` **Python** ```python value = s.get_int(snsr.RES_ENROLLMENT_ID) ``` Enrollment ID. A unique ID for the current user's current enrollment. **Available in these events:** [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail), [^pass](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pass) **Available in these iterators:** [enrollment-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#enrollment-iterator) ### model-stream - result - output stream - read-only **C/C++** ```c SnsrStream stream; snsrGetStream(s, SNSR_MODEL_STREAM, &stream); ``` **Java** ```java SnsrStream stream = s.getStream(Snsr.MODEL_STREAM); ``` **Python** ```python stream = s.get_stream(snsr.MODEL_STREAM) ``` Enrolled wake word model stream. The result after enrollment and adaptation. This is a model that will recognize the enrolled phrases. 
Save to permanent storage with [copy](https://doc.sensory.com/tnl/7.8/api/io.md#stream-copy). Retrieving the model stream will fail with [SETTING_NOT_AVAILABLE](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) if there are no enrolled users. **Available in these events:** [^done](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#done) **Available in these iterators:** _none_ ### percent-done - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_PERCENT_DONE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_PERCENT_DONE); ``` **Python** ```python value = s.get_double(snsr.RES_PERCENT_DONE) ``` Enrollment progress. A value between `0` and `100` that is an estimate of the enrollment task completion progress. **Available in these events:** [^progress](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#progress) **Available in these iterators:** _none_ **Also see these related items:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^enrolled](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#enrolled) ### reason - result - string - read-only **C/C++** ```c const char * value; snsrGetString(s, SNSR_RES_REASON, &value); ``` **Java** ```java String value = s.getString(Snsr.RES_REASON); ``` **Python** ```python value = s.get_string(snsr.RES_REASON) ``` Reason for enrollment failure. Provides a shorthand indication of why a wake word enrollment was rejected.

Reason | Guidance
-------|---------
`energy-min` | Please speak louder.
`energy-stddev` | This recording does not sound like speech.
`silence-begin` | Please wait for the prompt before speaking.
`silence-end` | The trailing silence too short.
`snr` | The recording is too noisy. Please move to a quieter environment.
`rec-variance` | The difference between the recordings is too large. Please repeat the exact same phrase.
`poor-rec-limit` | The recording may not contain speech. Please speak a consistent trigger.
`clipping` | The recording is clipped, please reduce the volume.
`vowel-duration` | Please speak more slowly, or choose a different phrase with more vowel sounds.
`repetition` | This phrase has too many repeated sounds. Please choose another.
`silence-in-phrase` | Please don't pause - even briefly - in the middle of the recording.
`spot` | Please say the exact enrollment phrase, speaking clearly and naturally.
`phrase-quality` | This phrase is not suitable, please choose another or speak a little more slowly.
`audio-quality` | The enrollment shows signs of problems with the audio hardware.
`audio-duration` | The enrollment recording is too short.
`audio-volume` | No audio detected. Please speak louder.
`audio-failure` | _varies_

*All `reason` values and corresponding guidance* **Available in these events:** [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail) **Available in these iterators:** [reason-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#reason-iterator) **Also see these related items:** [reason-guidance](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-guidance) ### reason-guidance - result - string - read-only **C/C++** ```c const char * value; snsrGetString(s, SNSR_RES_GUIDANCE, &value); ``` **Java** ```java String value = s.getString(Snsr.RES_GUIDANCE); ``` **Python** ```python value = s.get_string(snsr.RES_GUIDANCE) ``` End-user guidance to correct a wake word enrollment failure. Provides a human-readable string (in English) with a suggestion on how to correct an enrollment failure. 
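Because the guidance string is English-only, an application may prefer to key its own wording off the `reason` code and fall back to the SDK text for anything it does not override. The following is a minimal sketch of that pattern; the override texts and the names `CUSTOM_GUIDANCE` and `guidance_for` are hypothetical and exist only for illustration.

```python
# Hypothetical application-side wording, keyed by `reason` code.
CUSTOM_GUIDANCE = {
    "energy-min": "Move closer to the microphone and speak up.",
    "snr": "Too much background noise. Try a quieter room.",
}


def guidance_for(reason: str, sdk_guidance: str) -> str:
    """Prefer the application's wording; otherwise use the SDK-provided guidance."""
    return CUSTOM_GUIDANCE.get(reason, sdk_guidance)
```

In a `^fail` handler the two arguments would come from `reason` and `reason-guidance` respectively, so new reason codes still produce usable text without a code change.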
**Available in these events:** [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail) **Available in these iterators:** [reason-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#reason-iterator) **Also see these related items:** [reason](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason) ### reason-pass - result - int - read-only **C/C++** ```c int value; snsrGetInt(s, SNSR_RES_REASON_PASS, &value); ``` **Java** ```java int value = s.getInt(Snsr.RES_REASON_PASS); ``` **Python** ```python value = s.get_int(snsr.RES_REASON_PASS) ``` Enrollment success. `1` if the enrollment passed, `0` if it was rejected. ### reason-threshold - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_REASON_THRESHOLD, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_REASON_THRESHOLD); ``` **Python** ```python value = s.get_double(snsr.RES_REASON_THRESHOLD) ``` Enrollment check threshold value. **Available in these events:** [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail) **Available in these iterators:** [reason-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#reason-iterator) **Also see these related items:** [reason-pass](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-pass), [reason-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-value) ### reason-value - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_REASON_VALUE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_REASON_VALUE); ``` **Python** ```python value = s.get_double(snsr.RES_REASON_VALUE) ``` Enrollment check failure value. The value of an enrollment check parameter. This is compared to [reason-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-threshold) to determine [reason-pass](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-pass). 
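When diagnosing rejected enrollments it can help to log all four `reason-*` values for each check together. The sketch below assumes the values have already been read inside a `reason-iterator` callback; the `ReasonCheck` record and `failed_checks` helper are hypothetical. It deliberately takes the pass flag from `reason-pass` rather than recomputing it, since the direction of the comparison against the threshold is not specified here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ReasonCheck:
    """One enrollment check, as read from the reason-* result keys."""
    reason: str        # from `reason`
    value: float       # from `reason-value`
    threshold: float   # from `reason-threshold`
    passed: bool       # from `reason-pass` (1 means passed, 0 means rejected)


def failed_checks(checks: Iterable[ReasonCheck]) -> List[str]:
    """Return the reason codes of all checks that did not pass, in order."""
    return [c.reason for c in checks if not c.passed]
```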
**Available in these events:** [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail) **Available in these iterators:** [reason-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#reason-iterator) **Also see these related items:** [reason-pass](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-pass), [reason-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#reason-threshold) ### user-count - result - int - read-only **C/C++** ```c int value; snsrGetInt(s, SNSR_RES_USER_COUNT, &value); ``` **Java** ```java int value = s.getInt(Snsr.RES_USER_COUNT); ``` **Python** ```python value = s.get_int(snsr.RES_USER_COUNT) ``` Enrolled user count. The number of distinct enrolled users. This setting is only available for [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) models that continuously adapt to speakers' voices. **Available in these events:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^new-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#new-user) **Available in these iterators:** _none_ ### user-index - result - int - read-only **C/C++** ```c int value; snsrGetInt(s, SNSR_RES_USER_INDEX, &value); ``` **Java** ```java int value = s.getInt(Snsr.RES_USER_INDEX); ``` **Python** ```python value = s.get_int(snsr.RES_USER_INDEX) ``` Enrolled user index. The index of the item under consideration in the current user list iteration. 
**Available in these events:** _none_ **Available in these iterators:** [user-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#user-iterator) ## THF Micro DSP ### dsp.production-ready - result - int - read-only **C/C++** ```c int value; snsrGetInt(s, SNSR_RES_EMBEDDED_MODEL_PRODUCTION_READY, &value); ``` **Java** ```java int value = s.getInt(Snsr.RES_EMBEDDED_MODEL_PRODUCTION_READY); ``` **Python** ```python value = s.get_int(snsr.RES_EMBEDDED_MODEL_PRODUCTION_READY) ``` Whether the DSP model files are suitable for production use. Possible values: `0`: The [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream) has enforced event limits. The model will stop working after a pre-determined number of recognition events, or audio samples processed. *This model is not suitable for production use.* `1`: The [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream) is not limited, and can be used in products. **Note:** This read-only value is valid only after a [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream) conversion is complete. **Available in these events:** _all_ **Available in these iterators:** _all_ **Also see these related items:** [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp.t-slice-version](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#dspt-slice-version) ### dsp.t-slice-version - result - string - read-only **C/C++** ```c const char * value; snsrGetString(s, SNSR_RES_MIN_EMBEDDED_VERSION, &value); ``` **Java** ```java String value = s.getString(Snsr.RES_MIN_EMBEDDED_VERSION); ``` **Python** ```python value = s.get_string(snsr.RES_MIN_EMBEDDED_VERSION) ``` Embedded port version. 
This is the minimum version of the embedded port (also known as the t-slice version) required to run the [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream) and [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream) DSP data files. **Note:** This read-only value is valid only after a [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream) conversion is complete. **Available in these events:** _all_ **Available in these iterators:** _all_ **Also see these related items:** [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp.production-ready](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#dspproduction-ready), [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target) ## Logging & diagnostics ### profile:real-time - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_CPU_SECONDS_USED, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_CPU_SECONDS_USED); ``` **Python** ```python value = s.get_double(snsr.RES_CPU_SECONDS_USED) ``` Seconds spent in model inference since last reset. Reports the number of wall-clock seconds spent running the model, or [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) slot in a model. This depends on having a usable real-time clock implementation. 
**Available in these events:** _none_ **Available in these iterators:** _none_ **Also see these related items:** [profile](https://doc.sensory.com/tnl/7.8/api/inference.md#profile), [CONFIG_CLOCK_FUNC](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_clock_func) ### profile:samples - result - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_RES_SAMPLES_PROCESSED, &value); ``` **Java** ```java double value = s.getDouble(Snsr.RES_SAMPLES_PROCESSED); ``` **Python** ```python value = s.get_double(snsr.RES_SAMPLES_PROCESSED) ``` Number of samples processed since last reset. Reports the number of audio samples processed while running a model, or [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) slot in a model. When used without a slot prefix `profile:samples` returns the same value as [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count). **Available in these events:** _none_ **Available in these iterators:** _none_ **Also see these related items:** [profile](https://doc.sensory.com/tnl/7.8/api/inference.md#profile), [profile:real-time](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#profilereal-time), [sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sample-count) [THF Micro]: https://doc.sensory.com/thf-micro/ "THF Micro documentation" *[API]: Application Programming Interface *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[NLU]: Natural Language Understanding model *[SDK]: Software Development Kit *[SLM]: Generative Small Language Model *[STT]: Speech To Text: transformers with language model and CTC decoding *[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "api/setting-keys/runtime.md" canonical_url: 
"https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime/" --- # Runtime These settings inspect the state, or change the behavior, of [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) object instances. They are not serialized by [save](https://doc.sensory.com/tnl/7.8/api/inference.md#save) or [dup](https://doc.sensory.com/tnl/7.8/api/inference.md#dup). Most applications will use [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm) with [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) or [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run) to present audio samples to a recognition task. Other runtime keys come up only in specific workflows: [grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream) and [phrases-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#phrases-stream) for custom LVCSR vocabularies, [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target) for [THF Micro][] export, and [add-context](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#add-context) / [delete-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#delete-user) / [rename-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#rename-user) for enrollment. Access runtime settings with the [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) `get` or `set` function that matches the setting type. For example, use [setStream](https://doc.sensory.com/tnl/7.8/api/inference.md#setters) to change the _(input stream)_ for [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). ### Details: Page conventions Settings are grouped by concern. Within each group they're listed alphabetically. A setting that serves more than one concern appears once under its primary group; secondary groups link to it. The code tabs on each setting show one paste-able call site per language. 
Adapt the placeholder names (`s` for the [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session), plus `value`, `stream`, `on_event`, or `on_item` for the operand) to your code. Each example assumes the language's standard prelude: **C/C++** ```c #include ``` **Java** ```java import com.sensory.speech.snsr.Snsr; import com.sensory.speech.snsr.SnsrSession; ``` **Python** ```python import snsr ``` For the full Session lifecycle (`snsrNew`, `new SnsrSession()`, `snsr.Session(...)`) see [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program). ## Audio I/O ### ->audio-pcm - runtime - input stream - write-only **C/C++** ```c snsrSetStream(s, SNSR_SOURCE_AUDIO_PCM, stream); ``` **Java** ```java s.setStream(Snsr.SOURCE_AUDIO_PCM, stream); ``` **Python** ```python s.set_stream(snsr.SOURCE_AUDIO_PCM, stream) ``` Input audio stream. This audio stream must be headerless, 16-bit LPCM-encoded, sampled at 16 kHz, with little-endian byte ordering. **Also see these related items:** [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push), [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run) ### ->feature - runtime - input stream - write-only - pre-release **C/C++** ```c snsrSetStream(s, SNSR_SOURCE_FEATURE, stream); ``` **Java** ```java s.setStream(Snsr.SOURCE_FEATURE, stream); ``` **Python** ```python s.set_stream(snsr.SOURCE_FEATURE, stream) ``` Input feature stream. **Pre-release:** This is an experimental feature. Do not use unless recommended by Sensory. Source stream in a proprietary Sensory format. ### <-audio-pcm - runtime - output stream - write-only **C/C++** ```c snsrSetStream(s, SNSR_SINK_AUDIO_PCM, stream); ``` **Java** ```java s.setStream(Snsr.SINK_AUDIO_PCM, stream); ``` **Python** ```python s.set_stream(snsr.SINK_AUDIO_PCM, stream) ``` Output audio stream. Headerless audio output, 16-bit LPCM-encoded, sampled at 16 kHz, with little-endian byte ordering. 
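The PCM layout described for the audio streams above (headerless, 16-bit LPCM, 16 kHz, little-endian) can be checked before any data reaches a session. The following is a minimal sketch that uses only the Python standard library and does not call the SDK; the helper names `samples_to_raw_pcm` and `wav_to_raw_pcm` are hypothetical, and the single-channel check is an assumption, since the text above does not state a channel count.

```python
import io
import struct
import wave

SAMPLE_RATE_HZ = 16000    # sample rate stated for ->audio-pcm
SAMPLE_WIDTH_BYTES = 2    # 16-bit samples


def samples_to_raw_pcm(samples):
    """Pack signed 16-bit integers as headerless little-endian LPCM bytes."""
    return struct.pack("<%dh" % len(samples), *samples)


def wav_to_raw_pcm(wav_bytes):
    """Validate an in-memory WAV file and return its sample data without the header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getframerate() != SAMPLE_RATE_HZ:
            raise ValueError("expected 16 kHz audio, got %d Hz" % wav.getframerate())
        if wav.getsampwidth() != SAMPLE_WIDTH_BYTES:
            raise ValueError("expected 16-bit samples")
        # Assumption: single-channel audio. The text above does not state a channel count.
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        # WAV stores 16-bit PCM little-endian, so the frames pass through unchanged.
        return wav.readframes(wav.getnframes())
```

The bytes returned by either helper are what a stream attached to `->audio-pcm` would be expected to deliver; how those bytes are wrapped in a stream object is outside the scope of this sketch.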
### <-feature - runtime - output stream - write-only - pre-release **C/C++** ```c snsrSetStream(s, SNSR_SINK_FEATURE, stream); ``` **Java** ```java s.setStream(Snsr.SINK_FEATURE, stream); ``` **Python** ```python s.set_stream(snsr.SINK_FEATURE, stream) ``` Output feature stream. **Pre-release:** This is an experimental feature. Do not use unless recommended by Sensory. Headerless feature output in a proprietary Sensory format. ### audio-stream-from - runtime - double - read-write **C/C++** ```c snsrSetDouble(s, SNSR_AUDIO_STREAM_FROM, value); ``` **Java** ```java s.setDouble(Snsr.AUDIO_STREAM_FROM, value); ``` **Python** ```python s.set_double(snsr.AUDIO_STREAM_FROM, value) ``` Audio stream requested start index. Start the next [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream) at this sample index value. Defaults to [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first). **Also see these related items:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size) ### audio-stream-to - runtime - double - read-write **C/C++** ```c snsrSetDouble(s, SNSR_AUDIO_STREAM_TO, value); ``` **Java** ```java s.setDouble(Snsr.AUDIO_STREAM_TO, value); ``` **Python** ```python s.set_double(snsr.AUDIO_STREAM_TO, value) ``` Audio stream requested end index. End the next [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream) at this sample index value. Defaults to [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last). 
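Because `audio-stream-from` and `audio-stream-to` are expressed as sample indices, an application that thinks in milliseconds needs a conversion step. The sketch below shows that arithmetic under the assumption that the indices count samples at the 16 kHz rate used by `->audio-pcm`; the function name `ms_to_sample_index` is hypothetical and not part of the SDK.

```python
SAMPLE_RATE_HZ = 16000  # assumption: indices count samples at the ->audio-pcm rate


def ms_to_sample_index(offset_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a time offset in milliseconds to a sample index.

    The result is a float because the setting type is double.
    """
    if offset_ms < 0:
        raise ValueError("offset must not be negative")
    return float(round(offset_ms * sample_rate_hz / 1000))


# Example: request the span from 250 ms to 1500 ms.
start_index = ms_to_sample_index(250)
end_index = ms_to_sample_index(1500)
```

The resulting values would then be passed to `set_double` with `snsr.AUDIO_STREAM_FROM` and `snsr.AUDIO_STREAM_TO`, as in the call sites shown for each setting.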
**Also see these related items:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size) ## VAD & endpointing ### skip-to-ms - runtime - int - write-only **C/C++** ```c snsrSetInt(s, SNSR_SKIP_TO_MS, value); ``` **Java** ```java s.setInt(Snsr.SKIP_TO_MS, value); ``` **Python** ```python s.set_int(snsr.SKIP_TO_MS, value) ``` VAD initial ignore duration, in ms. Ignore the first `skip-to-ms` ms of the [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm) input stream. Use this runtime setting to skip over a trigger phrase included in the source audio. The default is to process all audio. **Also see these related items:** [skip-to-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-sample) ### skip-to-sample - runtime - int - write-only **C/C++** ```c snsrSetInt(s, SNSR_SKIP_TO_SAMPLE, value); ``` **Java** ```java s.setInt(Snsr.SKIP_TO_SAMPLE, value); ``` **Python** ```python s.set_int(snsr.SKIP_TO_SAMPLE, value) ``` VAD initial ignore duration, in samples. Ignore the first `skip-to-sample` samples of the [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm) input stream. Use this runtime setting to skip over a trigger phrase included in the source audio. The default is to process all audio. **Also see these related items:** [skip-to-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-ms) ## LVCSR & STT ### grammar-stream. - runtime - input stream - write-only - TrulyNatural only **C/C++** ```c snsrSetStream(s, SNSR_GRAMMAR_STREAM, stream); ``` **Java** ```java s.setStream(Snsr.GRAMMAR_STREAM, stream); ``` **Python** ```python s.set_stream(snsr.GRAMMAR_STREAM, stream) ``` Recognition grammar stream. Creates a recognizer from a grammar specification read from a stream. The grammar must use [UTF‑8][UTF-8] encoding. 
The new model will recognize only those phrases that the grammar generates. The model will be ready to recognize once [setStream](https://doc.sensory.com/tnl/7.8/api/inference.md#setters) returns. For larger grammars the build process can take a significant amount of time. See [grammar syntax](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-syntax) for detail on how grammars are structured. To create a grammar for a class, append the class name to `grammar-stream.`: **Example:** **C/C++** ```c snsrSetStream(session, SNSR_GRAMMAR_STREAM "classname", grammarStream); ``` **Java** ```java session.set(Snsr.GRAMMAR_STREAM + "classname", grammarStream); ``` **Python** ```python session.set_stream(snsr.GRAMMAR_STREAM + "classname", grammar_stream) ``` **Note:** Requires a TrulyNatural model that supports building. **Also see these related items:** [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream), [phrases-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#phrases-stream) ### nlu-grammar-stream. - runtime - input stream - write-only _(TrulyNatural only)_ _(STT only)_ **C/C++** ```c snsrSetStream(s, SNSR_NLU_GRAMMAR_STREAM, stream); ``` **Java** ```java s.setStream(Snsr.NLU_GRAMMAR_STREAM, stream); ``` **Python** ```python s.set_stream(snsr.NLU_GRAMMAR_STREAM, stream) ``` NLU grammar stream. Creates a lightweight NLU parser from a grammar specification read from a stream. It takes precedence over NLU specified with [grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream) and can be used as an alternative to machine-learned NLU in some STT models. The grammar must use [UTF‑8][UTF-8] encoding. The NLU model will recognize only those phrases that the grammar generates. 
The NLU parser is applied to the recognition result, [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), and generates [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) events for each match found. The model will be ready to recognize once [setStream](https://doc.sensory.com/tnl/7.8/api/inference.md#setters) returns. For larger grammars the build process can take a significant amount of time. See [grammar syntax](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-syntax) for detail on how grammars are structured. To create a grammar for a class, append the class name to `nlu-grammar-stream.`: **Example:** **C/C++** ```c snsrSetStream(session, SNSR_NLU_GRAMMAR_STREAM "classname", grammarStream); ``` **Java** ```java session.set(Snsr.NLU_GRAMMAR_STREAM + "classname", grammarStream); ``` **Python** ```python session.set_stream(snsr.NLU_GRAMMAR_STREAM + "classname", grammar_stream) ``` **Note:** * Requires a TrulyNatural model that supports building. * If the model includes a machine-learned NLU component and the grammar-based NLU finds a match, this match replaces the machine-learned result completely. Before release 7.8.0 the `nlu-grammar-stream` result replaced the machine-learned result even if it found no matches. **Also see these related items:** [grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream), [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) ### phrases-stream. - runtime - input stream - write-only - TrulyNatural only **C/C++** ```c snsrSetStream(s, SNSR_PHRASES_STREAM, stream); ``` **Java** ```java s.setStream(Snsr.PHRASES_STREAM, stream); ``` **Python** ```python s.set_stream(snsr.PHRASES_STREAM, stream) ``` Recognition phrase list stream. 
Creates a recognizer from a list of phrases read from a stream. The phrase list must use [UTF‑8][UTF-8] encoding. Individual phrases are separated by newlines or semicolons. Comments start with `#` and run until the end of the phrase. Only the exact phrases in the list will be part of the recognition language. This utility setting converts the list of phrases into this grammar specification: ``` g = ~~( phrase0 | phrase1 | ... | phraseN)~~ ; ``` To create a phrase list for a class, append the class name to `phrases-stream.`: **Example:** **C/C++** ```c const char *phrases = "hello world; this is a test sentence"; snsrSetStream(session, SNSR_PHRASES_STREAM "classname", snsrStreamFromString(phrases)); ``` **Java** ```java String phrases = "hello world; this is a test sentence"; session.set(Snsr.PHRASES_STREAM + "classname", SnsrStream.fromString(phrases)); ``` **Python** ```python phrases = "hello world; this is a test sentence" session.set_stream( snsr.PHRASES_STREAM + "classname", snsr.Stream.from_string(phrases), ) ``` **Note:** Requires a TrulyNatural model that supports building. **Also see these related items:** [Stream](https://doc.sensory.com/tnl/7.8/api/io.md#stream), [grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream) ## Enrollment & adaptation ### add-context - runtime - int - read-write **C/C++** ```c snsrSetInt(s, SNSR_ADD_CONTEXT, value); ``` **Java** ```java s.setInt(Snsr.ADD_CONTEXT, value); ``` **Python** ```python s.set_int(snsr.ADD_CONTEXT, value) ``` Current enrollment includes trailing context. 
Set to `1` if the enrollment recording should include trailing context, for example: `"Hey Sensory will it rain tomorrow?"` **Also see these related items:** [^resume](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#resume) ### delete-user - runtime - string - write-only **C/C++** ```c snsrSetString(s, SNSR_DELETE_USER, "text"); ``` **Java** ```java s.setString(Snsr.DELETE_USER, "text"); ``` **Python** ```python s.set_string(snsr.DELETE_USER, "text") ``` Delete an enrolled user. Deletes the named user, then: * invokes [^enrolled](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#enrolled) event, * invokes [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted) event iff any enrolled users remain, * invokes [^done](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#done) event. **Also see these related items:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^done](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#done), [^enrolled](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#enrolled) ### re-adapt - runtime - int - write-only **C/C++** ```c snsrSetInt(s, SNSR_RE_ADAPT, value); ``` **Java** ```java s.setInt(Snsr.RE_ADAPT, value); ``` **Python** ```python s.set_int(snsr.RE_ADAPT, value) ``` Force re-adaptation of all enrollments. If all users in an enrollment task have been adapted, the adaptation step is skipped. This is the case when one or more adapted enrollment contexts are loaded, and no new users are added. Setting `re-adapt` to `1` changes this behavior to always do the adaptation step. **Also see these related items:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted) ### rename-user - runtime - string - write-only **C/C++** ```c snsrSetString(s, SNSR_RENAME_USER, "text"); ``` **Java** ```java s.setString(Snsr.RENAME_USER, "text"); ``` **Python** ```python s.set_string(snsr.RENAME_USER, "text") ``` Rename an enrolled user. 
Changes the recognition result returned for [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) to the string argument. This setting is only available in fixed-phrase spotters that support continuous adaptation. **Example:** **C/C++** ```c snsrSetString(s, SNSR_USER, "original-user/phrase"); snsrSetString(s, SNSR_RENAME_USER, "new-user/new-phrase"); ``` **Java** ```java session.setString(Snsr.USER, "original-user/phrase"); session.setString(Snsr.RENAME_USER, "new-user/new-phrase"); ``` **Python** ```python session.set_string(snsr.USER, "original-user/phrase") session.set_string(snsr.RENAME_USER, "new-user/new-phrase") ``` **Also see these related items:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^new-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#new-user) ## Push / pull execution ### auto-flush - runtime - int - read-write - deprecated [6.20.0](https://doc.sensory.com/tnl/7.8/changes/version-6.md#v6.20.0) **C/C++** ```c snsrSetInt(s, SNSR_AUTO_FLUSH, value); ``` **Java** ```java s.setInt(Snsr.AUTO_FLUSH, value); ``` **Python** ```python s.set_int(snsr.AUTO_FLUSH, value) ``` Recognition pipeline end-of-stream flush behavior. **Deprecated:** Use [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) and [stop](https://doc.sensory.com/tnl/7.8/api/inference.md#stop) in new code instead. This boolean value controls whether [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run) flushes the recognition pipeline when one (or more) of the input streams report an end-of-file condition. The default value is `1`, which enables automatic flushing on [EOF](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_eof). This is appropriate for most applications. Set `auto-flush` to `0` when audio is presented to [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run) in small segments. 
**Also see these related items:** [run](https://doc.sensory.com/tnl/7.8/api/inference.md#run), [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push), [stop](https://doc.sensory.com/tnl/7.8/api/inference.md#stop) ## THF Micro DSP ### dsp-acmodel-stream - runtime - output stream - read-only **C/C++** ```c SnsrStream stream; snsrGetStream(s, SNSR_EMBEDDED_ACMODEL_STREAM, &stream); ``` **Java** ```java SnsrStream stream = s.getStream(Snsr.EMBEDDED_ACMODEL_STREAM); ``` **Python** ```python stream = s.get_stream(snsr.EMBEDDED_ACMODEL_STREAM) ``` Embedded device acoustic model data. **Also see these related items:** [THF Micro][], [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target), [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream), [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream), [dsp.production-ready](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#dspproduction-ready), [dsp.t-slice-version](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#dspt-slice-version) ### dsp-header-stream - runtime - output stream - read-only **C/C++** ```c SnsrStream stream; snsrGetStream(s, SNSR_EMBEDDED_HEADER_STREAM, &stream); ``` **Java** ```java SnsrStream stream = s.getStream(Snsr.EMBEDDED_HEADER_STREAM); ``` **Python** ```python stream = s.get_stream(snsr.EMBEDDED_HEADER_STREAM) ``` Embedded device search header. 
**Also see these related items:** [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target), [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream), [dsp.t-slice-version](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#dspt-slice-version) ### dsp-search-stream - runtime - output stream - read-only **C/C++** ```c SnsrStream stream; snsrGetStream(s, SNSR_EMBEDDED_SEARCH_STREAM, &stream); ``` **Java** ```java SnsrStream stream = s.getStream(Snsr.EMBEDDED_SEARCH_STREAM); ``` **Python** ```python stream = s.get_stream(snsr.EMBEDDED_SEARCH_STREAM) ``` Embedded device search model data. **Also see these related items:** [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target), [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream), [dsp.t-slice-version](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#dspt-slice-version) ### dsp-target - runtime - string - read-write **C/C++** ```c snsrSetString(s, SNSR_EMBEDDED_TARGET, "text"); ``` **Java** ```java s.setString(Snsr.EMBEDDED_TARGET, "text"); ``` **Python** ```python s.set_string(snsr.EMBEDDED_TARGET, "text") ``` Embedded (DSP) device target name. 
**Also see these related items:** [THF Micro][], [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream), [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream), [dsp.t-slice-version](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#dspt-slice-version) ## Identity & metadata ### model-license-exp-date - runtime - double - read-only **C/C++** ```c double value; snsrGetDouble(s, SNSR_MODEL_LICENSE_EXPDATE, &value); ``` **Java** ```java double value = s.getDouble(Snsr.MODEL_LICENSE_EXPDATE); ``` **Python** ```python value = s.get_double(snsr.MODEL_LICENSE_EXPDATE) ``` Model license expiration date. Returns the license expiration date of the most recently loaded model in seconds since the [epoch][], or `0` if no model is loaded. For production keys, which never expire, the expiration date is `0`. **Example:** ```c double e; time_t expdate; snsrGetDouble(s, SNSR_MODEL_LICENSE_EXPDATE, &e); expdate = (time_t)e; ``` **Also see these related items:** [model-license-exp-message](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#model-license-exp-message) ### model-license-exp-message - runtime - string - read-only **C/C++** ```c const char * value; snsrGetString(s, SNSR_MODEL_LICENSE_EXPIRES, &value); ``` **Java** ```java String value = s.getString(Snsr.MODEL_LICENSE_EXPIRES); ``` **Python** ```python value = s.get_string(snsr.MODEL_LICENSE_EXPIRES) ``` Model license expiration message. Returns an expiration message string for the most recently loaded model, or `NULL` if no model is loaded. The returned string is of the form `"Model license expires on "`, or `NULL` for model license keys that do not expire. 
**Example:** ```c const char *expires; snsrGetString(s, SNSR_MODEL_LICENSE_EXPIRES, &expires); if (expires) fprintf(stderr, "%s\n", expires); ``` **Also see these related items:** [model-license-exp-date](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#model-license-exp-date), [model-license-exp-warn](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#model-license-exp-warn) ### model-license-exp-warn - runtime - string - read-only **C/C++** ```c const char * value; snsrGetString(s, SNSR_MODEL_LICENSE_WARNING, &value); ``` **Java** ```java String value = s.getString(Snsr.MODEL_LICENSE_WARNING); ``` **Python** ```python value = s.get_string(snsr.MODEL_LICENSE_WARNING) ``` Model license expiration warning message. This value is `NULL` for models with license keys that either do not expire, or that have an expiration date that is more than 60 days into the future. For license keys expiring in 60 days or fewer, the returned string will be of the form `"License will expire in 37 days."`. **Also see these related items:** [model-license-exp-date](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#model-license-exp-date), [model-license-exp-message](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#model-license-exp-message) ### model-name - runtime - string - read-write **C/C++** ```c snsrSetString(s, SNSR_MODEL_NAME, "text"); ``` **Java** ```java s.setString(Snsr.MODEL_NAME, "text"); ``` **Python** ```python s.set_string(snsr.MODEL_NAME, "text") ``` Source model name. The name of the model file used to create: - ROM-able C code with [save](https://doc.sensory.com/tnl/7.8/api/inference.md#save) and [SOURCE](https://doc.sensory.com/tnl/7.8/api/inference.md#fm_source). - A model in one of the supported embedded formats with [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream). This value is included in the comment header of the generated code, for information only. 
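The license settings earlier in this group report expiration as seconds since the epoch, with `0` meaning no expiration date, and describe a 60-day warning window. As a rough illustration of how an application might interpret a value read from `model-license-exp-date`, the sketch below classifies it using only the Python standard library. The function name `describe_expiration` and the exact message wording are hypothetical; the SDK's own strings come from `model-license-exp-message` and `model-license-exp-warn`.

```python
from datetime import datetime, timezone
from typing import Optional

WARN_WINDOW_DAYS = 60  # warning window described for model-license-exp-warn


def describe_expiration(exp_seconds: float, now: Optional[datetime] = None) -> str:
    """Summarize a model-license-exp-date value (seconds since the epoch)."""
    if exp_seconds == 0:
        # 0 is returned for production keys and when no model is loaded.
        return "no expiration date"
    now = now or datetime.now(timezone.utc)
    expires = datetime.fromtimestamp(exp_seconds, tz=timezone.utc)
    days_left = (expires - now).days
    if days_left < 0:
        return "expired on %s" % expires.strftime("%Y-%m-%d")
    if days_left <= WARN_WINDOW_DAYS:
        return "expires in %d days (%s)" % (days_left, expires.strftime("%Y-%m-%d"))
    return "expires on %s" % expires.strftime("%Y-%m-%d")
```

In an application the argument would come from `s.get_double(snsr.MODEL_LICENSE_EXPDATE)`; passing `now` explicitly keeps the function easy to test.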
### model:ids - runtime - string - read-write **C/C++** ```c snsrSetString(s, SNSR_PREPARE_SUBSET_INIT, "text"); ``` **Java** ```java s.setString(Snsr.PREPARE_SUBSET_INIT, "text"); ``` **Python** ```python s.set_string(snsr.PREPARE_SUBSET_INIT, "text") ``` Prepares a [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) for generating custom initialization code. Set this value to `NULL` to enable [load](https://doc.sensory.com/tnl/7.8/api/inference.md#load) instrumentation. This is used by [save](https://doc.sensory.com/tnl/7.8/api/inference.md#save) with [SUBSET_INIT](https://doc.sensory.com/tnl/7.8/api/inference.md#fm_subset_init) to generate custom library initialization code (in C), which references only modules encountered during [load](https://doc.sensory.com/tnl/7.8/api/inference.md#load). Linking against only a subset of available modules reduces executable size. This example will create a `custom-init.c` file. Add this to the build, and recompile with `-DSNSR_USE_SUBSET` to enable. The linked executable(s) will contain just the modules required to run `model-a.snsr` and `model-b.snsr`. See [Compile-time macros § SNSR_USE_SUBSET](https://doc.sensory.com/tnl/7.8/api/compile-macros.md#snsr-use-subset). 
**Example:** **C/C++** ```c SnsrSession s; snsrNew(&s); snsrSetString(s, SNSR_PREPARE_SUBSET_INIT, NULL); snsrLoad(s, snsrStreamFromFileName("model-a.snsr", "r")); snsrLoad(s, snsrStreamFromFileName("model-b.snsr", "r")); snsrSave(s, SNSR_FM_SUBSET_INIT, snsrStreamFromFileName("custom-init.c", "w")); snsrRelease(s); ``` **Also see these related items:** [load](https://doc.sensory.com/tnl/7.8/api/inference.md#load), [save](https://doc.sensory.com/tnl/7.8/api/inference.md#save), [SUBSET_INIT](https://doc.sensory.com/tnl/7.8/api/inference.md#fm_subset_init) ### prune:enable - runtime - string - read-write **C/C++** ```c snsrSetString(s, SNSR_PRUNE_SETTINGS, "text"); ``` **Java** ```java s.setString(Snsr.PRUNE_SETTINGS, "text"); ``` **Python** ```python s.set_string(snsr.PRUNE_SETTINGS, "text") ``` Prepare a [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) for pruning model settings. Pruning unused settings from a model reduces peak RAM requirements. This is typically only useful on platforms where heap memory is constrained. Set `prune:enable` to `yes` to instrument model loading and running. This instrumentation: * keeps track of which configuration settings the application accesses during model evaluation, * adds a list of these settings to the model upon saving, and * configures the model to prune settings not on this list from the runtime directly after loading. Set `prune:enable` to `no` to disable pruning. This is the default. **Note:** Pruned models do not contain enough information to be re-saved. **Also see these related items:** [load](https://doc.sensory.com/tnl/7.8/api/inference.md#load), [save](https://doc.sensory.com/tnl/7.8/api/inference.md#save) ### tag-identifier - runtime - string - read-write **C/C++** ```c snsrSetString(s, SNSR_TAG_IDENTIFIER, "text"); ``` **Java** ```java s.setString(Snsr.TAG_IDENTIFIER, "text"); ``` **Python** ```python s.set_string(snsr.TAG_IDENTIFIER, "text") ``` Exported identifier in ROM C code. 
When text segment C code is created with [save](https://doc.sensory.com/tnl/7.8/api/inference.md#save) and [SOURCE](https://doc.sensory.com/tnl/7.8/api/inference.md#fm_source), this setting specifies the name of the exported data structure. The value must start with an ASCII alphabetic character or `_`, and contain only alphanumerics and `_`; it must match the regular expression `[A-Za-z_][A-Za-z0-9_]*`.

[epoch]: https://en.wikipedia.org/wiki/Unix_time "Unix time"
[THF Micro]: https://doc.sensory.com/thf-micro/ "THF Micro documentation"
[UTF-8]: https://en.wikipedia.org/wiki/UTF-8

*[API]: Application Programming Interface
*[iff]: if, and only if
*[LPCM]: Linear pulse-code modulation
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[NLU]: Natural Language Understanding model
*[RAM]: Random Access Memory
*[ROM]: Read-Only Memory, typically nonvolatile flash memory
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector

---
source_path: "api/setting-keys/values.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/api/setting-keys/values/"
---

# Values

These are string constants that define [task-types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type) used with [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require), or that identify template [slots](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot).

### Details: Page conventions

Settings are grouped by concern. Within each group they're listed alphabetically. A setting that serves more than one concern appears once under its primary group; secondary groups link to it.

The code tabs on each setting show one paste-able call site per language.
Adapt the placeholder names (`s` for the [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session), plus `value`, `stream`, `on_event`, or `on_item` for the operand) to your code. Each example assumes the language's standard prelude:

**C/C++**

```c
#include
```

**Java**

```java
import com.sensory.speech.snsr.Snsr;
import com.sensory.speech.snsr.SnsrSession;
```

**Python**

```python
import snsr
```

For the full Session lifecycle (`snsrNew`, `new SnsrSession()`, `snsr.Session(...)`) see [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program).

## enroll

- value - string - read-only

**C/C++**

```c
snsrRequire(s, SNSR_TASK_TYPE, SNSR_ENROLL);
```

**Java**

```java
s.require(Snsr.TASK_TYPE, Snsr.ENROLL);
```

**Python**

```python
s.require(snsr.TASK_TYPE, snsr.ENROLL)
```

Phrase spotter enrollment task type.

**Also see these related items:** [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require), [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)

## feature

- value - string - read-only - pre-release

**C/C++**

```c
snsrRequire(s, SNSR_TASK_TYPE, SNSR_FEATURE);
```

**Java**

```java
s.require(Snsr.TASK_TYPE, Snsr.FEATURE);
```

**Python**

```python
s.require(snsr.TASK_TYPE, snsr.FEATURE)
```

Feature extractor task type.

**Pre-release:** This is an experimental feature. Do not use unless recommended by Sensory.

**Also see these related items:** [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require), [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)

## feature-lvcsr

- value - string - read-only _(TrulyNatural only)_ _(pre-release)_

**C/C++**

```c
snsrRequire(s, SNSR_TASK_TYPE, SNSR_FEATURE_LVCSR);
```

**Java**

```java
s.require(Snsr.TASK_TYPE, Snsr.FEATURE_LVCSR);
```

**Python**

```python
s.require(snsr.TASK_TYPE, snsr.FEATURE_LVCSR)
```

LVCSR recognizer from feature stream task type.
**Pre-release:** This is an experimental feature. Do not use unless recommended by Sensory.

**Also see these related items:** [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require), [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)

## feature-phrasespot

- value - string - read-only - pre-release

**C/C++**

```c
snsrRequire(s, SNSR_TASK_TYPE, SNSR_FEATURE_PHRASESPOT);
```

**Java**

```java
s.require(Snsr.TASK_TYPE, Snsr.FEATURE_PHRASESPOT);
```

**Python**

```python
s.require(snsr.TASK_TYPE, snsr.FEATURE_PHRASESPOT)
```

Phrase spotter from feature stream task type.

**Pre-release:** This is an experimental feature. Do not use unless recommended by Sensory.

**Also see these related items:** [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require), [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)

## feature-vad

- value - string - read-only - pre-release

**C/C++**

```c
snsrRequire(s, SNSR_TASK_TYPE, SNSR_FEATURE_VAD);
```

**Java**

```java
s.require(Snsr.TASK_TYPE, Snsr.FEATURE_VAD);
```

**Python**

```python
s.require(snsr.TASK_TYPE, snsr.FEATURE_VAD)
```

VAD from feature stream task type.

**Pre-release:** This is an experimental feature. Do not use unless recommended by Sensory.

**Also see these related items:** [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require), [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)

## lvcsr

- value - string - read-only - TrulyNatural only

**C/C++**

```c
snsrRequire(s, SNSR_TASK_TYPE, SNSR_LVCSR);
```

**Java**

```java
s.require(Snsr.TASK_TYPE, Snsr.LVCSR);
```

**Python**

```python
s.require(snsr.TASK_TYPE, snsr.LVCSR)
```

LVCSR and STT recognizer task type.
**Also see these related items:** [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require), [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type), [LVCSR models](https://doc.sensory.com/tnl/7.8/models/index.md#lvcsr-models), [STT models](https://doc.sensory.com/tnl/7.8/models/index.md#stt-models)

## phrasespot

- value - string - read-only

**C/C++**

```c
snsrRequire(s, SNSR_TASK_TYPE, SNSR_PHRASESPOT);
```

**Java**

```java
s.require(Snsr.TASK_TYPE, Snsr.PHRASESPOT);
```

**Python**

```python
s.require(snsr.TASK_TYPE, snsr.PHRASESPOT)
```

Phrase spotter task type. Also known as wake words, key word spotting, and spotted command sets.

**Also see these related items:** [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require), [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type), [wake word models](https://doc.sensory.com/tnl/7.8/models/index.md#wake-word-models)

## phrasespot-vad

- value - string - read-only

**C/C++**

```c
snsrRequire(s, SNSR_TASK_TYPE, SNSR_PHRASESPOT_VAD);
```

**Java**

```java
s.require(Snsr.TASK_TYPE, Snsr.PHRASESPOT_VAD);
```

**Python**

```python
s.require(snsr.TASK_TYPE, snsr.PHRASESPOT_VAD)
```

Phrase Spotter VAD task type.

**Also see these related items:** [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require), [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type), [tpl-spot-vad](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad.md#tpl-spot-vad-type)

## vad

- value - string - read-only

**C/C++**

```c
snsrRequire(s, SNSR_TASK_TYPE, SNSR_VAD);
```

**Java**

```java
s.require(Snsr.TASK_TYPE, Snsr.VAD);
```

**Python**

```python
s.require(snsr.TASK_TYPE, snsr.VAD)
```

VAD task type.
**Also see these related items:** [require](https://doc.sensory.com/tnl/7.8/api/inference.md#require), [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type), [VAD models](https://doc.sensory.com/tnl/7.8/models/index.md#vad-models)

*[API]: Application Programming Interface
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector

---
source_path: "changes/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/changes/"
---

# Version 7 changes (current)

TrulyNatural version 7 introduced support for STT and is fully backwards compatible with [version 6](https://doc.sensory.com/tnl/7.8/changes/version-6.md#v6-changes) models and code.

This library uses [semantic versioning][semver].

## 7.8.0 (2026-05-30)

- **Added**
    - License key introspection: [LICENSE_INFO](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_license_info), [LICENSE_SUPPORT](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_license_support), [LICENSE_OVERRIDE_NOT_VALID](https://doc.sensory.com/tnl/7.8/api/inference.md#rc).
    - [profile:samples](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#profilesamples).
    - Python language binding: platform-specific `snsr` wheel (ctypes over the C API) for [Python][] 3.10 and later, with sample projects in _sample/python/_ and documentation ([Python examples](https://doc.sensory.com/tnl/7.8/api/sample/python/index.md#python-examples), [Integrate with your build § Python](https://doc.sensory.com/tnl/7.8/api/build-system.md#build-python), and the Python tab in [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program)).
    - [snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval) `-i` flag to support [batch processing](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#batch-processing).
    - LLM-friendly documentation Markdown bundle for coding agents and RAG pipelines. Each page is mirrored on the site at the same path as its source file: section roots as `

![TrulyNatural SDK](assets/tnl.svg)

# TrulyNatural SDK

On-device speech recognition for embedded systems, mobile, and desktop.

v7.9.0-pre.0

- **What's in the SDK**

    [TrulyHandsfree][thf] (wake words and commands), [TrulyNatural Lite][tnl-lite] (LVCSR), and [TrulyNatural STT][tnl-stt] (transformer transcription): three variants, one API. Compare features, supported platforms, and host requirements.

    [Reference overview](https://doc.sensory.com/tnl/7.8/reference/overview.md#ref-overview)

- **Get started**

    Install the SDK and try wake words, LVCSR, or STT with the command-line tools (`snsr-eval`, `snsr-edit`). When you are ready to embed, build a Session API sample for your platform: embedded, mobile, or desktop.

    [Quick start](https://doc.sensory.com/tnl/7.8/getting-started/index.md#getting-started)

    [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program)

- **API reference**

    Native [C][] API plus [Java][] and [Python][] bindings (Android Kotlin uses the Java binding directly), iOS via a [bridging header][]. Function-level documentation, types, and error codes.

    [API reference](https://doc.sensory.com/tnl/7.8/api/index.md#api-reference)

## More

[Changelog](https://doc.sensory.com/tnl/7.8/changes/index.md#v7-changes) - User-visible changes by SDK version.

[Command-line tools](https://doc.sensory.com/tnl/7.8/tools/index.md#command-line-tools) - `snsr-eval`, `snsr-build`, and friends.

[FAQ](https://doc.sensory.com/tnl/7.8/faq.md#faq) - Frequently Asked Questions.

[Coding agents](https://doc.sensory.com/tnl/7.8/agent-tools.md#coding-agents) - Point Claude, Cursor, Aider, Continue, or other AI coding tools at this doc set so they can answer SDK questions without extra setup.

[Contact information](https://doc.sensory.com/tnl/7.8/contact.md#contact) - How to reach Sensory engineering and sales.

[Licenses](https://doc.sensory.com/tnl/7.8/licenses/index.md#sensory-sdk-license) - Legal agreements.

[bridging header]: https://developer.apple.com/documentation/swift/importing-objective-c-into-swift "Importing Objective-C into Swift"
[C]: https://en.wikipedia.org/wiki/C_(programming_language) "C programming language"
[Java]: https://en.wikipedia.org/wiki/Java_(programming_language) "Java programming language"
[Python]: https://en.wikipedia.org/wiki/Python_(programming_language) "Python programming language"
[thf]: https://www.sensory.com/wake-word/ "Low Power Wake Words & Phrase Recognition Engine"
[tnl-lite]: https://www.sensory.com/natural-language-understanding/ "Large Vocabulary Continuous Speech Recognition (LVCSR) with Dynamic Natural Language Understanding"
[tnl-stt]: https://www.sensory.com/embedded-speech-to-text/ "Embedded Speech To Text"

*[API]: Application Programming Interface
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[SDK]: Software Development Kit
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "licenses/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/licenses/"
---

# Sensory SDK license

## Sensory Inc. Software Developer's Kit ("SDK")

**Note:** NOTICE TO USER: PLEASE READ THIS CONTRACT CAREFULLY. BY USING ALL OR ANY PORTION OF THE SDK YOU ACCEPT ALL THE TERMS AND CONDITIONS OF THIS LICENSE AGREEMENT, INCLUDING, IN PARTICULAR THE LIMITATIONS ON: USE CONTAINED IN SECTIONS 2, 3 AND 4; WARRANTY IN SECTION 7; AND LIABILITY IN SECTION 8. YOU AGREE THAT THIS LICENSE AGREEMENT IS ENFORCEABLE LIKE ANY WRITTEN NEGOTIATED AGREEMENT SIGNED BY YOU. IF YOU DO NOT AGREE, DO NOT USE THIS SDK. IF YOU ACQUIRED THE SDK ON TANGIBLE MEDIA (FOR EXAMPLE, CD-ROM) WITHOUT AN OPPORTUNITY TO REVIEW THIS LICENSE, AND YOU DO NOT ACCEPT THIS LICENSE AGREEMENT, YOU MAY NOT USE THE SDK.

### 1. Definitions

"SDK" means all of the contents of the files, disk(s), CD-ROM(s) or other media with which this License Agreement is provided, including but not limited to (i) Sample Code; (ii) Header File Information; (iii) Redistributable Code, (iv) Documentation; and (v) any upgrades, modified versions, updates, and/or additions thereto, if any, provided to You by Sensory.

"Sample Code" means sample software in source code format designated in the Documentation as "Sample Code."

"Header File Information" means any header files (*.h files) supplied in connection with the SDK, including without limitation any related information detailing contents of header files.

"Redistributable Code" means certain object code files designated in the Documentation as "Redistributable Code."

"Documentation" means explanatory materials supplied with the SDK or made available online on Sensory public web pages related to the SDK.

"Sensory" means Sensory, Inc., a California corporation, 3150 De La Cruz Blvd., Suite 120, Santa Clara, CA 95054.

"Sensory Software" means the generally commercially available versions of Sensory TrulyNatural SDK.

"Developer Programs" means Your application programs that are designed to function with Sensory Software products.

"Developer," "You," and "Your" refer to any person or entity accessing or using this SDK, or any component thereof.

"End User License Agreement" means an end user license agreement that provides a (a) limited, nonexclusive right to use the subject Developer Program with no further right to reproduce (except for archival and/or backup copies permitted by law) and/or distribute the subject Developer Program, (b) prohibition against distributing, selling, sublicensing, renting, loaning or leasing the subject Developer Program, (c) prohibition against reverse engineering, decompiling, disassembling or otherwise attempting to discover the source code of the subject Developer Program that is substantially similar to that set forth in Section 3 below, (d) statement that You and your suppliers retain all right, title and interest in the subject Developer Program that is substantially similar to that set forth as Section 5 below, (e) statement that Your suppliers disclaim all warranties, conditions, representations or terms with respect to the subject Developer Program substantially similar to the disclaimer set forth as Section 7 below, and (f) limit of liability substantially similar to that set forth as Section 8 below for the benefit of Your suppliers.

### 2. License

Subject to the terms and conditions of this License Agreement, Sensory grants You a non-exclusive, nontransferable, royalty-free license to (a) use the SDK for the sole purpose of internally developing Developer Programs, (b) reproduce and modify Sample Code as a component of Developer Programs that add significant and primary functionality to the Sample Code, (c) reproduce Redistributable Code solely as a component of Developer Programs that add significant and primary functionality to the Redistributable Code and (d) distribute Sample Code and/or Redistributable Code in object code form only as a component of Developer Programs that add significant and primary functionality to the Sample Code and/or Redistributable Code provided that (i) You distribute such object code under the terms and conditions of an End User License Agreement, (ii) You include a copyright notice reflecting the copyright ownership of Developer in such Developer Programs, (iii) You shall be solely responsible to Your customers for any update or support obligation or other liability which may arise from such distribution, (iv) You shall not make any statements that Your Developer Product is "certified," or that its performance is guaranteed, by Sensory, and (v) You do not use Sensory's name or trademarks to market Your Developer Product without written permission of Sensory. Any modified or merged portion of the Sample Code, and/or merged portion of the Redistributable Code, IS subject to this License Agreement. Use of Sensory Software and/or any other Sensory application program is subject to the applicable end user license agreement for such application software even if such Sensory Software is supplied to You in connection with this License Agreement.
You may make a limited number of copies of the Documentation to be used by Your employees or consultants for internal development purposes and not for general business purposes or for distribution by any means, and such employees or consultants shall be subject to this License Agreement. You may distribute up to five instances of Your Developer Program with Sensory Software to third parties under this agreement. You may distribute more than five instances of Sensory Software with Your Developer Programs only under separate license from Sensory. Sensory is under no obligation to provide any support under this License Agreement, including upgrades or future versions of the SDK, Sensory Software and/or any component thereof, to Developer, end users, or to any other party. Further developer support, software licensing, trademark licensing and trademark usage information is available through www.Sensoryinc.com.

### 3. Restrictions

Except for the limited distribution rights as provided in Section 2 above with respect to Sample Code, Redistributable Code, and Developer Programs, You may not distribute, sell, sublicense, rent, loan, or lease the SDK, Sensory Software, and/or any component thereof to any third party. You also agree not to reverse engineer, decompile, disassemble or otherwise attempt to discover the source code of the SDK, Sensory Software and/or any component thereof except to the extent (i) you may be expressly permitted to decompile under applicable law, (ii) it is essential to do so in order to achieve operability of the SDK or Sensory Software with another software program, and (iii) you have first requested Sensory to provide the information necessary to achieve such operability and Sensory has not made such information available. Sensory has the right to impose reasonable conditions and to request a reasonable fee before providing such information.
Any information supplied by Sensory or obtained by you, as permitted hereunder, may only be used by you for the purpose described herein and may not be disclosed to any third party or used to create any software which is substantially similar to the expression of the SDK and/or Sensory Software.

### 4. Confidential Information

You agree not to disseminate or in any way disclose Header File Information to any person, firm or business except for Your employees who need to know such Header File Information and who have previously agreed to be bound by a confidentiality obligation consistent with the obligation set forth in this Section 4. Further, You agree to treat the Header File Information with the same degree of care as You accord to Your own confidential information, but in any event no less than reasonable care. Your obligations under this section with respect to the Header File Information shall terminate when You can document that such Header File Information was (i) in the public domain at or subsequent to the time it was communicated to You by Sensory through no fault of yours, (ii) developed by Your employees or agents independently of and without reference to any information communicated to You by Sensory; or (iii) disclosed in response to a valid order by a court or other governmental body, as otherwise required by law, or as necessary to establish the rights of either party under this License Agreement.

### 5. Proprietary Rights

You agree to protect Sensory's copyright and other ownership interests in all items in this SDK. You agree that all copies of items in this SDK reproduced for any reason by You will contain the same copyright, trademark, and other proprietary notices as appropriate and appear on or in the master items delivered by Sensory in this SDK. Sensory and/or its suppliers retain all right, title and ownership throughout the world in the intellectual property embodied within the SDK.
Except as stated herein, this License Agreement does not grant You any rights to patents, copyrights, trade secrets, trademarks, or any other rights in respect to the items in this SDK.

### 6. Term

This License Agreement is effective until terminated. Sensory has the right to terminate Your License immediately if You fail to comply with any term of this License Agreement. Upon any such termination, You must (a) return all full and partial copies of the items in this SDK immediately to Sensory and (b) discontinue distribution of any Sample Code and/or Redistributable Code. Sections 1, 3, 4, 5, 6, 7, 8, 9, 11 and 12 shall survive any termination and/or expiration of this License Agreement.

### 7. Disclaimer of Warranty

Sensory licenses the SDK to You on an "AS IS" basis and without warranty of any kind. SENSORY AND ITS SUPPLIERS DO NOT AND CANNOT WARRANT THE PERFORMANCE OR RESULTS YOU MAY OBTAIN BY USING THE SDK. EXCEPT FOR ANY WARRANTY, CONDITION, REPRESENTATION OR TERM TO THE EXTENT TO WHICH THE SAME CANNOT OR MAY NOT BE EXCLUDED OR LIMITED BY LAW APPLICABLE TO YOU IN YOUR JURISDICTION, SENSORY AND ITS SUPPLIERS MAKE NO WARRANTIES, CONDITIONS, REPRESENTATIONS OR TERMS, EXPRESS OR IMPLIED, WHETHER BY STATUTE, COMMON LAW, CUSTOM, USAGE OR OTHERWISE AS TO THE SDK OR ANY COMPONENT THEREOF, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT OF THIRD PARTY RIGHTS, INTEGRATION, MERCHANTABILITY, SATISFACTORY QUALITY OR FITNESS FOR ANY PARTICULAR PURPOSE. Some states or provinces do not allow the exclusion of implied warranties so the above limitations may not apply to You. You may have rights that vary from jurisdiction to jurisdiction. For further warranty information, You may contact Sensory at the address provided above.

### 8. Limitation of Liability

IN NO EVENT WILL SENSORY OR ITS SUPPLIERS BE LIABLE TO YOU FOR ANY DAMAGES, CLAIMS OR COSTS WHATSOEVER ARISING FROM THIS LICENSE AGREEMENT AND/OR YOUR USE OF THE SDK OR ANY COMPONENT THEREOF, INCLUDING WITHOUT LIMITATION ANY CONSEQUENTIAL, INDIRECT, INCIDENTAL DAMAGES, OR ANY LOST PROFITS OR LOST SAVINGS, EVEN IF A SENSORY REPRESENTATIVE HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH LOSS, DAMAGES, CLAIMS OR COSTS OR FOR ANY CLAIM BY ANY THIRD PARTY. THE FOREGOING LIMITATIONS AND EXCLUSIONS APPLY TO THE EXTENT PERMITTED BY APPLICABLE LAW IN YOUR JURISDICTION. SENSORY'S AGGREGATE LIABILITY AND THAT OF ITS SUPPLIERS UNDER OR IN CONNECTION WITH THIS LICENSE AGREEMENT SHALL BE LIMITED TO FIFTY U.S. DOLLARS ($50.00). Nothing contained in this License Agreement limits Sensory's liability to You in the event of death or personal injury resulting from Sensory's negligence or for the tort of deceit (fraud). Sensory is acting on behalf of its suppliers for the purpose of disclaiming, excluding and/or limiting obligations, warranties and liability as provided in this License Agreement, but in no other respects and for no other purpose.

### 9. Indemnification

You agree to defend, indemnify, and hold Sensory and its suppliers harmless from and against any claims or lawsuits, including attorneys' reasonable fees, that arise or result from the use or distribution of Developer Programs, provided that Sensory gives You prompt written notice of any such claim, tenders to You the defense or settlement of such a claim at Your expense, and cooperates with You, at Your expense, in defending or settling such claim.

### 10. Government Regulations

You agree that any Developer Program that includes Sample Code and/or Redistributable Code (i) will include in its license agreement a reference to applicable U.S.
Government regulations which control licensing of software and (ii) will not be shipped, transferred, or exported into any country or used in any manner prohibited by the United States Export Administration Act or any other export laws, restrictions or regulations (collectively the "Export Laws"). In addition, if any part of the SDK is identified as export controlled items under the Export Laws, you represent and warrant that you are not a citizen, or otherwise located within, an embargoed nation (including without limitation Iran, Iraq, Syria, Sudan, Libya, Cuba, North Korea and Serbia) and that you are not otherwise prohibited under the Export Laws from receiving the SDK. All rights to use the SDK are granted on condition that such rights are forfeited if you fail to comply with the terms of this License Agreement.

### 11. Governing Law

This License Agreement will be governed by and construed in accordance with the substantive laws in force in the State of California. The courts of Santa Clara County, California shall each have exclusive jurisdiction over all disputes relating to this License Agreement. This License Agreement will not be governed by the conflict of law rules of any jurisdiction or the United Nations Convention on Contracts for the International Sale of Goods, the application of which is expressly excluded.

### 12. General

You may not assign Your rights or obligations granted under this License Agreement without the prior written consent of Sensory. None of the provisions of this License Agreement shall be deemed to have been waived by any act or acquiescence on the part of Sensory, its agents, or employees, but only by an instrument in writing signed by an authorized signatory of Sensory. It is expressly agreed that a breach of Section 3 or 4 of this License Agreement will cause irreparable harm to Sensory and that a remedy at law will be inadequate.
Therefore, in addition to any and all remedies available at law, Sensory will be entitled to seek an injunction or other equitable remedies in all legal proceedings in the event of any threatened or actual violation thereof. When conflicting language exists between this License Agreement and any other agreement included in this SDK (except for the Integration Key License Agreement or any agreement supplied with Sensory Software), this License Agreement shall supersede. If either Sensory or Developer employs attorneys to enforce any rights arising out of or relating to this License Agreement, the prevailing party shall be entitled to recover reasonable attorneys' fees. You acknowledge that You have read this License Agreement, understand it, and that it is the complete and exclusive statement of Your agreement with Sensory which supersedes any prior agreement, oral or written, between Sensory and You with respect to the licensing to You of this SDK. No variation of the terms of this License Agreement will be enforceable against Sensory unless Sensory gives its express consent in a writing signed by an authorized signatory of Sensory.

Sensory, TrulyHandsfree, TrulyNatural, and the Sensory logo, are either trademarks or registered trademarks of Sensory, Inc. in the United States and/or other countries.

*[ROM]: Read-Only Memory, typically nonvolatile flash memory
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "licenses/oss.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/licenses/oss/"
---

# Open Source licenses

One or more of the libraries included in this TrulyNatural SDK uses third-party Open Source components with [permissive license agreements][oss-permissive]. See the _README\*.md_ files in _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/lib//_ for library-specific details.
You can omit all Open Source Software from a TrulyNatural binary by:

- Compiling with `-DSNSR_OMIT_OSS_COMPONENTS` (see [Compile-time macros § SNSR_OMIT_OSS_COMPONENTS](https://doc.sensory.com/tnl/7.8/api/compile-macros.md#snsr-omit-oss-components))
- or by using custom initialization with models that do not require these components. See [model:ids](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#modelids), [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit), and [reduce code size](https://doc.sensory.com/tnl/7.8/faq.md#reduce-code-size).

Query [oss-components](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#oss-components) to determine which of these modules are linked into an application.

### Details: [WebRTC](https://webrtc.googlesource.com/src/+/refs/heads/main/LICENSE)

### Details: [ONNX Runtime](https://github.com/microsoft/onnxruntime/blob/v1.21.1/LICENSE)

### Details: [ONNX Runtime dependencies](https://github.com/microsoft/onnxruntime/blob/v1.21.1/ThirdPartyNotices.txt)

[oss-permissive]: https://en.wikipedia.org/wiki/Permissive_software_license "Grants use rights, forbids almost nothing"

*[API]: Application Programming Interface
*[OSS]: Open-source software
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "models/downloads.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/downloads/"
---

# Downloads

_(STT only)_

The following STT models are available for download. These are compatible with TrulyNatural STT SDK [7.7.0](https://doc.sensory.com/tnl/7.8/changes/index.md#v7.7.0) and later.

Contact your account representative or [Sensory sales](https://doc.sensory.com/tnl/7.8/contact.md#contact) for additional languages and customizations.
**Filename key:**

- `opt-vg-vad-stt-`: These are pipelines made from the [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-opt-spot-vad-lvcsr) template with a US English "Voice Genie" wake word in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) and an STT recognizer in slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1).
- `-B`: Model includes an NLU component that identifies intents and entities.
- `-pnc`: Model includes punctuation and capitalization.
- `-slm`: Model includes a small generative language model.

Larger models are more accurate but also require more CPU cycles.

| Language { data-sort-default } | Domain | Size in MiB { data-sort-method="number" } | Model |
|:---|:---|---:|:---|
| English (US) | automotive | 226 | [opt-vg-vad-stt-enUS-automotive-large-1.3.14-B-pnc_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-automotive-large-1.3.14-B-pnc_66.snsr) |
| English (US) | automotive | 91 | [opt-vg-vad-stt-enUS-automotive-medium-2.3.14-B-pnc_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-automotive-medium-2.3.14-B-pnc_66.snsr) |
| English (US) | automotive | 49 | [opt-vg-vad-stt-enUS-automotive-small-2.3.14-B-pnc_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-automotive-small-2.3.14-B-pnc_66.snsr) |
| English (US) | general | 11 | [opt-vg-vad-stt-enUS-general-micro-2.0.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-general-micro-2.0.3_66.snsr) |
| English (US) | general | 7 | [opt-vg-vad-stt-enUS-general-nano-2.0.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-general-nano-2.0.3_66.snsr) |
| English (US) | general | 199 | [opt-vg-vad-stt-enUS-general-large-2.0.3-pnc_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-general-large-2.0.3-pnc_66.snsr) |
| English (US) | general | 67 | [opt-vg-vad-stt-enUS-general-medium-2.4.3-pnc_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-general-medium-2.4.3-pnc_66.snsr) |
| English (US) | general | 28 | [opt-vg-vad-stt-enUS-general-small-2.2.3-pnc_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-general-small-2.2.3-pnc_66.snsr) |
| English (British) | general | 196 | [opt-vg-vad-stt-enGB-general-large-2.0.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enGB-general-large-2.0.3_66.snsr) |
| English (British) | general | 64 | [opt-vg-vad-stt-enGB-general-medium-2.0.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enGB-general-medium-2.0.3_66.snsr) |
| English (British) | general | 25 | [opt-vg-vad-stt-enGB-general-small-2.0.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enGB-general-small-2.0.3_66.snsr) |
| German | general | 199 | [opt-vg-vad-stt-deDE-general-large-2.2.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-deDE-general-large-2.2.3_66.snsr) |
| German | general | 64 | [opt-vg-vad-stt-deDE-general-medium-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-deDE-general-medium-2.3.3_66.snsr) |
| German | general | 25 | [opt-vg-vad-stt-deDE-general-small-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-deDE-general-small-2.3.3_66.snsr) |
| French | general | 202 | [opt-vg-vad-stt-frFR-general-large-2.0.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-frFR-general-large-2.0.3_66.snsr) |
| French | general | 64 | [opt-vg-vad-stt-frFR-general-medium-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-frFR-general-medium-2.3.3_66.snsr) |
| French | general | 25 | [opt-vg-vad-stt-frFR-general-small-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-frFR-general-small-2.3.3_66.snsr) |
| Italian | general | 197 | [opt-vg-vad-stt-itIT-general-large-1.2.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-itIT-general-large-1.2.3_66.snsr) |
| Italian | general | 64 | [opt-vg-vad-stt-itIT-general-medium-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-itIT-general-medium-2.3.3_66.snsr) |
| Italian | general | 25 | [opt-vg-vad-stt-itIT-general-small-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-itIT-general-small-2.3.3_66.snsr) |
| Japanese | general | 215 | [opt-vg-vad-stt-jaJP-general-large-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-jaJP-general-large-2.3.3_66.snsr) |
| Japanese | general | 64 | [opt-vg-vad-stt-jaJP-general-medium-2.2.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-jaJP-general-medium-2.2.3_66.snsr) |
| Japanese | general | 26 | [opt-vg-vad-stt-jaJP-general-small-2.4.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-jaJP-general-small-2.4.3_66.snsr) |
| Korean | general | 215 | [opt-vg-vad-stt-koKR-general-large-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-koKR-general-large-2.3.3_66.snsr) |
| Korean | general | 64 | [opt-vg-vad-stt-koKR-general-medium-2.4.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-koKR-general-medium-2.4.3_66.snsr) |
| Korean | general | 25 | [opt-vg-vad-stt-koKR-general-small-2.4.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-koKR-general-small-2.4.3_66.snsr) |
| Spanish | general | 197 | [opt-vg-vad-stt-esES-general-large-2.2.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-esES-general-large-2.2.3_66.snsr) |
| Spanish | general | 64 | [opt-vg-vad-stt-esES-general-medium-2.4.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-esES-general-medium-2.4.3_66.snsr) |
| Spanish | general | 25 | [opt-vg-vad-stt-esES-general-small-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-esES-general-small-2.3.3_66.snsr) |

**Provenance:** The wake word,
and the speech-to-text acoustic, language, and NLU models are owned by Sensory and have no third-party dependencies.

*[API]: Application Programming Interface
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[NLU]: Natural Language Understanding model
*[PNC]: Punctuation and Capitalization, an STT model variant that emits cased text with punctuation
*[SDK]: Software Development Kit
*[SLM]: Generative Small Language Model
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector

---
source_path: "models/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/"
---

# Models

This distribution includes sample models in _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/model/_.

You can also [download models](https://doc.sensory.com/tnl/7.8/models/downloads.md#models-downloads) for additional languages in a range of sizes.

Console examples in this section assume _$HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0_ as the SDK install directory; replace that prefix if you installed elsewhere.

## Wake words

See the [wake word model type](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) for a description of model behavior and settings.

### spot-voicegenie-enUS-6.5.1-m.snsr

Fixed-phrase "Voice Genie" wake word for US English.

### spot-hbg-enUS-1.4.0-m.snsr

Fixed-phrase "Hello Blue Genie" wake word for US English.

### spot-music-enUS-1.2.0-m.snsr

Music command set for US English. Commands include "play music", "pause music", "stop music", "previous song", and "next song".

## Adapting wake word

See the [adapting wake word model type](https://doc.sensory.com/tnl/7.8/models/types/ca.md#ca-type) for a description of model behavior and settings.
### ca-voicegenie-enUS-1.1.0.snsr

This is a fixed-phrase spotter for "Voice Genie" in US English that adapts to users' speech to improve false-accept rates.

Model adaptation and enrollment happen automatically and without any additional code requirements, so you can use this model as a drop-in replacement for the fixed-phrase [Voice Genie](https://doc.sensory.com/tnl/7.8/models/index.md#spot-voicegenie-enUS) spotter.

Configuration settings of particular interest include [cache-file](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#cache-file) and [max-users](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-users). Use [user-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#user-iterator), [delete-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#delete-user), and [rename-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#rename-user) to manage user enrollments.

**Note:** This model requires support for [multi-threading](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#thread-support).

### Details

The reported [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) value changes once enrollment has identified and enrolled a new speaker.
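In the `snsr-eval` session shown in this section, result lines such as `11745 12525 user1/voice_genie` carry a `userN/` prefix in the text value once a speaker is enrolled. The following is a minimal Python sketch, not part of the SDK, for splitting captured output lines of that shape. It assumes the three columns are a start timestamp, an end timestamp, and the reported text; `parse_result_line` and `Detection` are hypothetical names used only for illustration.

```python
from typing import NamedTuple, Optional


class Detection(NamedTuple):
    start: int            # first column, as printed (assumed start timestamp)
    end: int              # second column, as printed (assumed end timestamp)
    user: Optional[str]   # "user1", "user2", ... or None before enrollment
    phrase: str           # e.g. "voice_genie"


def parse_result_line(line: str) -> Optional[Detection]:
    """Parse one result line; return None for prompts, comments, and ^C."""
    parts = line.strip().split(maxsplit=2)
    if len(parts) != 3 or not (parts[0].isdigit() and parts[1].isdigit()):
        return None
    user, sep, phrase = parts[2].partition("/")
    if not sep:
        # No "user/" prefix: the whole text value is the phrase.
        user, phrase = None, parts[2]
    return Detection(int(parts[0]), int(parts[1]), user, phrase)


if __name__ == "__main__":
    for raw in ("6810 7545 voice_genie", "11745 12525 user1/voice_genie", "^C"):
        print(parse_result_line(raw))
```

Lines that are not results, such as the shell prompt or `^C`, yield `None`, so captured console output can be filtered without extra pre-processing.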
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0

# two different speakers saying "voice genie"
% bin/snsr-eval -t model/ca-voicegenie-enUS-1.1.0.snsr\
    -s cache-file=ca-vg-cache.snsr\
    -s max-users=3
2235 3045 voice_genie
6810 7545 voice_genie
11745 12525 user1/voice_genie
16845 17595 user1/voice_genie
29355 30180 voice_genie
34845 35820 user2/voice_genie
37815 38520 user1/voice_genie
40080 40905 user2/voice_genie
^C

# restart, loading enrollments from the cache file
% bin/snsr-eval -t model/ca-voicegenie-enUS-1.1.0.snsr\
    -s cache-file=ca-vg-cache.snsr\
    -s max-users=3
12045 13035 user2/voice_genie
15180 15840 user1/voice_genie
17745 18465 user1/voice_genie
20175 20820 user2/voice_genie
^C
```

## Wake word enrollment

See the [wake word enrollment model type](https://doc.sensory.com/tnl/7.8/models/types/enroll.md#enroll-type) for a description of model behavior and settings.

### eft-hbg-enUS-23.0.0.9.snsr

EFT spotter for "Hello Blue Genie", US English. This model produces wake words with a low imposter accept rate.

### udt-universal-3.67.1.0.snsr

UDT enrollment. This model creates spotters with nine different operating points and supports multiple languages. Optimized for German, English (Australian, British, Indian, United States), Spanish (European Union, North American), French (European Union), Italian, Korean, Brazilian Portuguese, and Mandarin Chinese.

### udt-enUS-5.1.1.9.snsr

UDT enrollment with backwards compatibility.

**Note:** This older model produces enrolled wake words with reduced accuracy. Use this model only when targeting a [THF Micro][] 3.x DSP port, or when the wake word is followed by additional validation.

## VAD

See the [VAD model type](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) for a description of model behavior and settings.

### vad-ml-3.0.0.snsr

Deep-learned stand-alone Voice Activity Detector.
## LVCSR _(TrulyNatural only)_

See the [LVCSR model type](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) for a description of model behavior and settings.

### lvcsr-build-enUS-14.0.2-5MB.snsr _(TrulyNatural only)_

US English recognizer with 4.9 MiB acoustic model and support for [grammar-based recognition](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-based-recognition). Supports classes and NLU.

Use [search.frame-nota](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#searchframe-nota) to adjust out-of-grammar rejection.

**Example:**

```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% snsr-eval -t model/lvcsr-build-enUS-14.0.2-5MB.snsr\
    -s partial-result-interval=0 \
    -f grammar-stream data/grammars/enrollments-nlu-slot.txt \
    data/enrollments/armadillo-1-2-c.wav
NLU intent: navigate (0) = how far away is winco
NLU entity: place (0) = winco
285 1995 armadillo how far away is winco
```

### lvcsr-lib-enUS-14.0.2.snsr _(TrulyNatural only)_

US English class library. This provides pre-compiled classes for common domains. Use [lvcsr-build-enUS](https://doc.sensory.com/tnl/7.8/models/index.md#lvcsr-build-enUS) to simplify [grammar-based recognition](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-based-recognition) grammars. See [class libraries](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-class-libraries) for a usage example.

_lvcsr-lib-enUS-14.0.2.json_ provides the content of the class description table below in machine-readable format.

### Details: lvcsr-lib-enUS-14.0.2.snsr class library

Class Name	Description
s.alarm-phrases	Basic commands for alarm. - language: enUS - version: 0.0.1 - description: Basic commands for alarm. - detail: Phrases for setting an alarm, including specific times (see s.time). - examples: wake me up at , set alarm for tomorrow, create an alarm for , cancel alarm for , dismiss my alarm for tomorrow, end alarm, stop alarm, dismiss alarm, cancel alarm, snooze alarm, snooze for minutes, cancel all alarms, set alarm for every week day, set my alarm for every - category: command sets
s.alphanumeric	Matches an individual letter (a-z) or an individual integer (0-9). - language: enUS - version: 0.0.2 - description: Matches an individual letter (a-z) or an individual integer (0-9). - detail: Individual alphabet letters including enUK 'zed' and adjectives used for spelling, plus individual numbers zero through nine, including 'oh'. - examples: zero, oh, , , zed, cap , capital , big , lowercase , uppercase , little , double - category: characters
s.call-emergency	Common ways of calling for help in case of emergency. - language: enUS - version: 0.0.1 - description: Common ways of calling for help in case of emergency. - detail: Ways to call emergency services; does not include "help" or "help me". Use this grammar with caution. - examples: , it's an emergency, it is an emergency, this is an emergency, i'm having an emergency, we're having an emergency - category: emergency
s.clock-phrases	Basic commands for setting a clock. - language: enUS - version: 0.0.1 - description: Basic commands for setting a clock. - detail: Phrases for setting a clock, including specific times (see s.time). - examples: time, what time is it, what is the time, what's the time, set time to , change time to , set clock to , change clock to - category: command sets
s.color.extended	Matches individual color names. - language: enUS - version: 0.0.1 - description: Matches individual color names. - detail: Assortment of less common colors. - examples: indigo, teal, beige, olive, bronze, camel, citron, copper, coral, cyan, chestnut, blond, ebony, gold, jade, lavender, mint, denim, garnet, gunmetal, linen, eggshell, eggplant, puce, taupe, silver, vermillion, navy, magenta, mustard, saffron, sage, maroon, tangerine, turquoise, rose, oxblood, violet - category: color
s.color	Matches individual color names. - language: enUS - version: 0.0.4 - description: Matches individual color names. - detail: Primary and secondary colors. - examples: black, blue, brown, gray, green, orange, pink, purple, red, white, yellow - category: color
s.control.car	Simple command set for car voice control. - language: enUS - version: 0.0.1 - description: Simple command set for car voice control. - detail: Basic commands for car voice control, including door controls, window controls, environment controls, basic stereo control, mirror/wiper/lights control. - examples: open driver's side window, close the passenger's side window, roll down front windows, roll up all the windows, lock passenger's side door, child lock back doors, unlock front doors, turn off front defroster, turn on heat, turn up heater, turn down the A C, turn fan up, turn on windshield wipers, open garage, close garage door, turn on navigation, end navigation, turn on radio, turn the front speakers up, turn down the rear speakers, turn up treble, turn the bass down, increase back wiper speed, decrease wiper speed, unfold side mirrors, fold right side mirror, turn on driver's side seat warmer, turn the passenger's side seat warmer down, turn on the dome light - category: command sets
s.control.door	Basic commands for smart lock/door control. - language: enUS - version: 0.0.1 - description: Basic commands for smart lock/door control. - detail: Simple command set for controlling a smart lock/door. - examples: lock the door, unlock my door, open door, close the door, is the door open, is my door open - category: command sets
s.control.environment	General, basic commands for environment control. - language: enUS - version: 0.0.1 - description: General, basic commands for environment control. - detail: Commands for environment control devices, such as fans, AC, space heaters. - examples: turn on, turn off, turn up, turn down, start, stop, set speed to, change speed to, set speed to , change speed to , set to percent, change to percent, set to , change to - category: command sets
s.control.lights	Simple command set for voice-controlled lights. - language: enUS - version: 0.0.1 - description: Simple command set for voice-controlled lights. - detail: Basic commands for voice-controlled lights. - examples: turn on all the hallway lights, turn off closet light, dim the den lights, brighten all the master bath lights, set foyer light to off, turn all the breakfast nook lights on - category: command sets
s.control.media	Basic media control phrases. - language: enUS - version: 0.0.1 - description: Basic media control phrases. - detail: Basic, common controls for music, movies, etc. - examples: play, pause, stop, skip, skip to, next, fast forward, back, rewind, fast rewind, reverse, repeat, start, shuffle - category: command sets
s.control.media.tv	Simple command set for voice-controlled television. - language: enUS - version: 0.0.1 - description: Simple command set for voice-controlled television. - detail: Basic commands for voice-controlled television, see s.switch for specific subcommands for "". - examples: cable, music, browser, apps, streaming, channel , recordings, guide, input , turn on the t v, switch off my television, t v off, television on, switch t v off, turn television on, power on, power off, turn volume to , adjust volume down , turn volume up, mute t v, unmute the television, turn on closed captioning, next channel, channel up, channel down - category: command sets
s.control.media.volume	Simple commands for audio volume control. - language: enUS - version: 0.0.1 - description: Simple commands for audio volume control. - detail: Typical, general commands for controlling audio volume, for home/car smart speaker, television, etc. - examples: increase the , decrease the , turn up the , turn down the , turn the up, turn to five, turn up to ten, turn down to one, mute , louder, softer, quieter, up, down - category: command sets
s.control.phone	Basic commands for calling/messaging control. - language: enUS - version: 0.0.2 - description: Basic commands for calling/messaging control. - detail: Typical, open-ended calling/messaging commands for voice-control assistant on phone. All can be followed by a specific entity/contact name. - examples: send text, send a text to, send voice message, send audio message to, reply to, text to, show message, play message from, show emails, show me emails from, read my recent messages, show my new messages from, play all voice messages, play all voicemail, play new voicemail messages from, send email, send an email to, show contacts, call - category: command sets
s.control.thermostat	Basic commands for thermostat. - language: enUS - version: 0.0.1 - description: Basic commands for thermostat. - detail: Phrases for setting a thermostat, including specific temperatures in C or F (see s.temperature.thermostat.celsius and s.thermostat.fahrenheit). - examples: set thermostat to , turn thermostat up degrees, turn thermostat down degrees, make it much warmer, make it much cooler, make it degrees warmer, make it degrees cooler, what is the temperature - category: command sets
s.control.vacuum	Basic commands for vacuum cleaner. - language: enUS - version: 0.0.1 - description: Basic commands for vacuum cleaner. - detail: Phrases for a voice-controlled vacuum cleaner, including room names to direct vacuum (see s.rooms). - examples: start vacuuming, stop vacuuming, resume vacuuming, end vacuuming, pause vacuuming, unpause vacuuming, start vacuum, stop vacuum, pause vacuum, unpause vacuum, dock vacuum, charge vacuum, where is the vacuum, is the vacuum charging, is the vacuum charged, is vacuum docked, vacuum the , start vacuuming the , resume vacuuming the , end vacuuming the - category: command sets
s.control.virtual-meeting	Basic commands for controlling a virtual meeting platform. - language: enUS - version: 0.0.1 - description: Basic commands for controlling a virtual meeting platform. - detail: Assortment of basic commands for interacting with and controlling a virtual meeting platform. - examples: mute, mute self, mute all, unmute, unmute self, unmute all, initiate new meeting, new meeting, schedule meeting, go to upcoming meetings, past meetings, go to recordings, go to chat, help, test audio, test video, switch microphone, switch video, leave meeting, blur background - category: command sets
s.date	Common ways of saying individual dates. - language: enUS - version: 0.0.1 - description: Common ways of saying individual dates. - detail: Covers dates from January 1, 1800 to December 31, 2099; with and without years. - examples: first of january, the second of february, the third of march two thousand fifteen, the fourth of april two thousand and fifteen, may fifth, june the sixth, july seventh nineteen eighty, august the eighth twenty oh two - category: date
s.duration-queries	General queries about how long something is, to be followed by nouns. - language: enUS - version: 0.0.1 - description: General queries about how long something is, to be followed by nouns. - detail: An assortment of open-ended queries involving multiple interrogatives which can be combined with entities for a variety of duration applications. (Does not include phrases specific to setting a timer; see s.timer-phrases.grm for this application.) - examples: how long is, for how long is, what's the duration of, what is the length of time of, please show me how long is, tell me the duration of, I want to know what the length of time is of - category: commands
s.email	Common ways of spelling out individual email addresses. - language: enUS - version: 0.0.1 - description: Common ways of spelling out individual email addresses. - detail: Produces a spelled-out email address, with common domains or custom, spelled-out domain. - examples: a b c at gmail dot com, h e l l o one two three at yahoo dot com, m y dot e m a i l underscore a d d r e s s at outlook dot com, one dash two dash three at i cloud dot com, a plus b plus c at aol dot com, x y z at hotmail dot com, i j k at ms dot com, e m a i l at m y d o m a i n dot org, e m a i l at y o u r dash d o m a i n dot com - category: email
s.help	Device assistance commands. - language: enUS - version: 0.0.2 - description: Device assistance commands. - detail: Common ways to ask for help with a device. - examples: help, help me, help menu, what do i say, what can i do, how can i use this, how do i use this thing - category: help
s.increase-decrease	Increase/decrease commands. - language: enUS - version: 0.0.2 - description: Increase/decrease commands. - detail: Generic increase and decrease language. No "bump it", "make it quieter/hotter", etc. - examples: turn up, turn it down, decrease, increase, crank it, crank up - category: commands
s.integer-billions	Matches individual long forms of integers from 1 billion to 999 billion. - language: enUS - version: 0.0.1 - description: Matches individual long forms of integers from 1 billion to 999 billion. - detail: Does not include common rounded/float versions (i.e. 1.5 billion), or 'a billion' (for easier integration in a larger number set). - examples: one billion, twelve billion five million and three hundred, ninety billion and ninety nine million nine thousand and one, one hundred eighty billion twenty one million, three hundred twenty one billion and eighty two million and two, nine hundred ninety nine billion nine hundred ninety nine million nine hundred ninety nine thousand nine hundred and ninety nine - category: numbers
s.integer-millions	Matches individual long forms of integers from 1 million to 999 million. - language: enUS - version: 0.0.1 - description: Matches individual long forms of integers from 1 million to 999 million. - detail: Does not include common rounded/float versions (i.e. 1.5 million), or 'a million' (for easier integration in a larger number set). - examples: one million, one million one hundred thousand, one million one hundred thousand and one, two million and sixteen, three hundred million three thousand and two, nine hundred ninety nine million nine hundred ninety nine thousand nine hundred and ninety nine - category: numbers
s.integer-thousands	Matches individual numbers from one thousand to 999 thousand. - language: enUS - version: 0.0.1 - description: Matches individual numbers from one thousand to 999 thousand. - detail: Does not include 'a thousand', etc, for easier integration in a larger number set. - examples: one thousand, two thousand and one, two thousand eight hundred and ninety nine, nine hundred ninety nine thousand nine hundred and ninety nine - category: numbers
s.letter	Matches an individual letter (a-z). - language: enUS - version: 0.0.2 - description: Matches an individual letter (a-z). - detail: Individual alphabet letters, including enUK "zed", plus optional adjectives used for spelling. - examples: , cap , capital , big , lowercase , uppercase , little , double - category: characters
s.location-queries	General queries to be followed by place names. - language: enUS - version: 0.0.1 - description: General queries to be followed by place names. - detail: An assortment of open-ended queries involving multiple interrogatives which can be combined with entities for location-app-specific purposes. - examples: where is, what are the directions to, map my route to, show the way to, I need a map to, tell me how to get to, how do I get to, guide me to, what's the way I could drive to, what is the way I might find, how do you drive to, tell me how to locate, how would I get to, get me to, how might one drive to, could you please help me find, please tell me where one would locate - category: commands
s.money	Matches individual US currency expressions. - language: enUS - version: 0.0.1 - description: Matches individual US currency expressions. - detail: Combinations of cents and dollars up to 100 dollars. - examples: zero cents, zero dollars, one cent, one dollar, cents, , , and cents, ('two ninety nine'), oh ('three oh five') - category: money
s.noun-queries	General queries about where something is, to be followed by nouns. - language: enUS - version: 0.0.1 - description: General queries about where something is, to be followed by nouns. - detail: An assortment of open-ended queries involving multiple interrogatives which can be combined with entities for a variety of applications. (Does not include location-app-specific queries involving words "drive", "guide", "route", or "map".) - examples: find, get, locate, show, show me, help me find, help me locate, please locate, where is, what's the way to get, what's the way one might get to, what's the way you can get to, what's the way I could locate, what's the way to see, show me where to find, please show me where one would get, tell me how I find, please tell me how one might get, where would I see, where might one locate, how can one get to, how would I find, how do I see, I want to know how one might get to - category: commands
s.number-integer-0-1trillion	Matches individual cardinal numbers zero through one trillion. - language: enUS - version: 0.0.1 - description: Matches individual cardinal numbers zero through one trillion. - detail: Zero through one trillion, with optional 'and' between trillion, millions, thousand, etc. components. - examples: zero, one, ninety eight, one hundred ninety eight, six thousand one hundred and ninety eight, five million six thousand six hundred thousand thirty two, one and a half billion, a trillion, one trillion - category: numbers
s.number-integer-0-9	Matches individual cardinal numbers zero through nine. - language: enUS - version: 0.0.1 - description: Matches individual cardinal numbers zero through nine. - detail: Zero through nine. - examples: zero, one, two, three, four, five, six, seven, eight, nine - category: numbers
s.number-integer-0-100	Matches cardinal numbers zero through one hundred. - language: enUS - version: 0.0.1 - description: Matches cardinal numbers zero through one hundred. - detail: Zero through one hundred, including 'a hundred'. - examples: zero, one, eleven, thirty, thirty four, ninety nine, one hundred, a hundred - category: numbers
s.number-integer-0-999	Matches individual cardinal numbers zero through nine hundred ninety nine. - language: enUS - version: 0.0.3 - description: Matches individual cardinal numbers zero through nine hundred ninety nine. - detail: Zero through nine hundred ninety nine, with optional 'and' between hundreds and tens component. - examples: zero, one, eleven, twenty one, one hundred twelve, nine hundred and ninety nine - category: numbers
s.number-integer-11-19	Matches individual cardinal numbers eleven through nineteen. - language: enUS - version: 0.0.1 - description: Matches individual cardinal numbers eleven through nineteen. - detail: Eleven through nineteen. - examples: eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen - category: numbers
s.number-integer-hundred-thousands	Matches individual cardinal numbers one hundred thousand through nine hundred ninety nine thousand nine hundred ninety nine. - language: enUS - version: 0.0.1 - description: Matches individual cardinal numbers one hundred thousand through nine hundred ninety nine thousand nine hundred ninety nine. - detail: One hundred thousand through nine hundred ninety nine thousand nine hundred ninety nine, with optional 'and' between thousands, hundreds, etc., components. - examples: one hundred thousand, one hundred and two thousand, four hundred twenty one thousand and three hundred, forty one thousand one hundred and three - category: numbers
s.number-ordinal-0-9	Matches individual ordinal numbers 'zeroth' through 'ninth'. - language: enUS - version: 0.0.1 - description: Matches individual ordinal numbers 'zeroth' through 'ninth'. - detail: Zeroth through ninth. - examples: zeroth, first, second, third, fourth, fifth, sixth, seventh, eighth, ninth - category: numbers
s.number-ordinal-10_99	Matches individual ordinal numbers 'tenth' through 'ninety ninth'. - language: enUS - version: 0.0.1 - description: Matches individual ordinal numbers 'tenth' through 'ninety ninth'. - detail: Tenth through ninety ninth. - examples: tenth, eleventh, twenty first, thirtieth, eighty ninth - category: numbers
s.on-off	On and off commands. - language: enUS - version: 0.0.1 - description: On and off commands. - detail: Common expressions for turning devices on and off. - examples: turn off, turn on, switch off, switch on, turn off, turn on, start, stop - category: on-off
s.ordering	General, open-ended phrases for ordering/requesting. - language: enUS - version: 0.0.1 - description: General, open-ended phrases for ordering/requesting. - detail: An assortment of open-ended phrases for ordering which can be combined with specific entities for a variety of applications. - examples: may I have, may I please get, may I try a, may I order one, can I try, can I please order, can I grab two, can I please have three, could I get, could I please have several, could I try that, could I please order this, I'll take, I'll take the, I'd like to have, I'd like to have several, I want, I want three, give me, give me one, please give me, please give me the - category: commands
s.percent	Matches individual percentages using cardinal numbers. - language: enUS - version: 0.0.2 - description: Matches individual percentages using cardinal numbers. - detail: Percents zero to one hundred (with 'percent' unit). - examples: percent, one hundred percent, a hundred percent - category: percent
s.phone-number	Matches individual phone numbers. - language: enUS - version: 0.0.1 - description: Matches individual phone numbers. - detail: Common ways to say 10 digit phone numbers in the US; includes options for 'one' and 'nine' international dialing code, eight-hundred. Limits area codes to 200-900 (US). Interchangeable 'zero' and 'oh'. - examples: one two three four five six seven eight nine oh, four one oh three three oh nine two nine two, one three zero nine three five five two zero two one - category: phone-number
s.rooms	Matches an individual room name. - language: enUS - version: 0.0.2 - description: Matches an individual room name. - detail: Individual names for types of rooms, for homes and businesses. - examples: porch, living room, parlor, entry, entry way, entry room, den, breakfast nook, hallway, mud room, kitchen, bathroom, master bath, master bathroom, restroom, bedroom, guest room, play room, dining room, upstairs, downstairs, laundry room, entrance, basement, pantry, family room, foyer, den, sunroom, library, studio, nursery, office, home office, rec room, recreation room, attic, conference room, conference room, meeting room, reception, reception area, server room, break room, wellness room - category: rooms
s.single-digit-integer	Matches individual numbers one through nine. - language: enUS - version: 0.0.2 - description: Matches individual numbers one through nine. - detail: Does not include zero, as it may not apply to all use cases. - examples: one, two, three, four, five, six, seven, eight, nine - category: numbers
s.single-digit-ordinal	Matches individual ordinal numbers 'first' through 'ninth'. - language: enUS - version: 0.0.2 - description: Matches individual ordinal numbers 'first' through 'ninth'. - detail: First through ninth, does not include 'zeroth'. - examples: first, second, third, fourth, fifth, sixth, seventh, eighth, ninth - category: numbers
s.special-character	Commonly occurring special characters. - language: enUS - version: 0.0.2 - description: Commonly occurring special characters. - detail: Ways to speak common special characters like punctuation. - examples: period, comma, stop, at sign, hashtag, pound sign, dollar sign, plus, curly bracket, right curly brace, open angle bracket, open paren, question mark, exclamation mark, apostrophe, pipe, colon, underscore, carat - category: characters
s.switch	Open-ended commands for general navigation. - language: enUS - version: 0.0.1 - description: Open-ended commands for general navigation. - detail: Common commands for switching from one item to another, in a menu, on a television, and more. - examples: go to, switch to, watch, turn to, change to, put on, tune to - category: commands
s.temperature.oven.celsius	Matches individual temperatures 100 C to 300 C for use with household ovens. - language: enUS - version: 0.0.2 - description: Matches individual temperatures 100 C to 300 C for use with household ovens. - detail: Values from one hundred to three hundred, with optional 'degree' and/or 'celsius' appended. - examples: , degrees, degrees celsius - category: temperature
s.temperature.oven.fahrenheit	Matches individual temperatures 200 F to 500 F for use with household ovens. - language: enUS - version: 0.0.1 - description: Matches individual temperatures 200 F to 500 F for use with household ovens. - detail: Values from two hundred to five hundred with optional "degrees" and "fahrenheit" appended. - examples: , degrees, degrees fahrenheit - category: temperature
s.temperature.thermostat.celsius	Matches individual temperatures 10 C to 40 C (thermostat). - language: enUS - version: 0.0.1 - description: Matches individual temperatures 10 C to 40 C (thermostat). - detail: Values from ten to forty, with optional "degrees" and "celsius", for use with household thermostats. - examples: , degrees, degrees celsius - category: temperature
s.temperature.thermostat.fahrenheit	Matches individual temperatures 40 to 100 (thermostat). - language: enUS - version: 0.0.2 - description: Matches individual temperatures 40 to 100 (thermostat). - detail: Values from forty to one hundred, with optional "degrees" and "fahrenheit", for use with household thermostats. - examples: , degrees, fahrenheit, degrees fahrenheit - category: temperature
s.time	Colloquial time/clock phrases in US English (no 'military' time). - language: enUS - version: 0.0.1 - description: Colloquial time/clock phrases in US English (no 'military' time). - detail: 1 through 12 pm and am, half past/quarter till/ten to etc., o'clock, noon/midnight, afternoon/morning/evening/night matched to their common equivalent numeric hours (with some overlap in evening, night, and afternoon). - examples: five thirteen, seven thirty a m, eight o'clock p m, twenty till eight, five past noon, ten to midnight, a quarter after four, quarter before nine, six thirteen in the morning, eight o'clock in the evening, a quarter past ten o'clock - category: time
s.timer-phrases	Basic commands for setting a timer. - language: enUS - version: 0.0.1 - description: Basic commands for setting a timer. - detail: Phrases for setting a timer, including specific durations (details in s.timer). - examples: the timer, set a timer for , please start the timer for , start a timer, set timer, timer, how much time is left on my timer, wake me in - category: command sets
s.timer	Durations for setting timers and alarms. - language: enUS - version: 0.0.1 - description: Durations for setting timers and alarms. - detail: Seconds, minutes, hours and combinations for setting timers and alarms. - examples: a sec, a second, one second, a minute, one minute, half hour, a half hour, one half hour, an hour, one hour, one hour and a half, an hour and a half, seconds, minutes, hours, minutes seconds, minutes and seconds, hours minutes, hours and minutes - category: timer
s.two-digit-integer	Matches individual two-digit cardinal numbers, 10-99. - language: enUS - version: 0.0.2 - description: Matches individual two-digit cardinal numbers, 10-99. - detail: Ten through ninety nine. - examples: ten, eleven, eighteen, forty six, seventy, ninety nine - category: numbers
s.two-digit-ordinal	Matches individual two-digit ordinal numbers, 10-99. - language: enUS - version: 0.0.2 - description: Matches individual two-digit ordinal numbers, 10-99. - detail: Tenth through ninety ninth. - examples: tenth, eleventh, twenty third, eightieth, ninety ninth - category: numbers
s.verb-queries	General queries about how something is done, to be followed by verbs. - language: enUS - version: 0.0.1 - description: General queries about how something is done, to be followed by verbs. - detail: An assortment of open-ended queries involving multiple interrogatives which can be combined with actions for a variety of applications. - examples: how do I, show me how I, tell me how you, please show me how I can, tell me how I might, please show me how you, please help me to, how to, I want to know how I, how might I, help me, how can I, how to, I want to know how you, I want to know how you can, please tell me how you might - category: commands
s.weight	Matches individual weights, in pounds and ounces. - language: enUS - version: 0.0.2 - description: Matches individual weights, in pounds and ounces. - detail: Combinations of pounds and ounces, up to 100 pounds and 100 ounces. Allows for decimal pound amounts. - examples: one pound, one ounce, pounds, one pound ounces, one pound and ounces, pounds ounces, pounds and ounces, point pounds, one pound one ounce, and a half pounds - category: weight
s.when-queries	General queries about when something is, to be followed by nouns. - language: enUS - version: 0.0.1 - description: General queries about when something is, to be followed by nouns. - detail: An assortment of open-ended queries involving multiple interrogatives which can be combined with entities for a variety of time applications. (Does not include phrases specific to setting a timer; see s.timer-phrases.grm for this application.) - examples: what time is, at what hour is, what's the time of day of, what is the time of, please show me what hour is, please show when I might get, please show me when one might get, please tell me the hour of, tell me when you are going to, show me when is, I want to know when you can get, I want to know the time of, I want to know when is - category: commands
s.yes-no	Common yes/no responses. - language: enUS - version: 0.0.2 - description: Common yes/no responses. - detail: Common ways of saying yes or no. Does not include things like "nah I'm good", "right", "thanks", and "that's it". - examples: yep, yup, yeah, yeah sure, sure, yes, yes please, please, okay, nope, no, no thanks, nah, no thank you - category: yes-no

## STT _(STT only)_ See the [STT model type](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) for a description of model behavior and settings. You can [download](https://doc.sensory.com/tnl/7.8/models/downloads.md#models-downloads) additional models for other languages in a range of different sizes. ### stt-enUS-automotive-medium-2.3.15-pnc.snsr _(STT only)_ STT recognizer with broad-domain support and special focus on automotive command-and-control tasks. It includes a machine-learned NLU component that identifies automotive intents and entities. Results include capitalization and punctuation. This model requires STT support, which currently depends on third-party Open Source modules that are optionally included in the TrulyNatural SDK. See [Open Source Licenses](https://doc.sensory.com/tnl/7.8/licenses/oss.md#open-source-licenses) for details. All model components (acoustic, language, and NLU) are owned by Sensory. ### Details: NLU intents and entities | Intent | Entities | Examples | |:-------|:---------|:---------| | activate_car_alarm | | car alarm on
set the car alarm | | activate_lights | automatic_lights, dashboard_lights, daytime_running_lights, door_lights, fog_lights, hazard_lights, high_beams, interior_lights, low_beams, parking_lights, light, front, rear, driver_side, left_side, passenger_side, right_side, duration, percentage_value, trunk | turn on the headlights
flash the brights
cabin lights on | | adjust_mirror | front, rear, driver_side, left_side, passenger_side, right_side, side, rearview, direction, adjust_type, percentage_value, length_unit length_unit | adjust driver side mirror
fold in the side mirrors
lower rearview mirror
set the passenger side mirror up one inch | | affirm | | confirm
please | | answer_call | contact_name | accept my call
answer call from ylana | | average_m_p_g | | check gas mileage | | battery_level | number_unit number_unit | check the battery level | | bot_challenge | | am i talking to a human
are you a bot | | call_contact | contact_name, message | call jeff
make a call to marco | | call_emergency | ems | call nine one one
call the police | | call_end | | hang up call | | call_general | | digit dial
make a call | | call_number | phone_number | call seven six five seven three zero six four one five | | camera_off | camera, front, rear, rearview, side | back camera off | | camera_on | camera, front, rear, rearview, side | activate all cameras
turn on the dash cam | | cancel | | cancel
quit | | change_gears | gear | put it in reverse
shift into park | | change_temp_unit | temperature_unit | change to celsius | | change_time_zone | time_zone | change time zone to mountain standard time | | check_messages | contact_name | check my messages
how many messages in my inbox | | climate_sync | hvac, left_side, right_side | turn on climate sync
turn off synchronization of the a c | | close_door | front, rear, driver_side, left_side, passenger_side, right_side, fuel_door, garage_door, side_door, van_door | back doors shut
close driver's side door
lower garage door | | close_glove_box | glove_box | shut the cubby-hole
close the jockey box
raise the glove compartment | | close_hood | hood, rear, trunk rear, hazard_lights | close my bonnet
slam the hood | | close_trunk | trunk | close the tailgate
lower the trunk
shut the boot | | close_window | front, rear, driver_side, left_side, passenger_side, right_side, roof, percentage_value, adjust_type, number_unit, length_unit adjust_type, number_unit, length_unit | close the sunroof
close the back passenger's side windows all the way
front window up | | connect_bluetooth | | bluetooth connect device car
connect cell phone
pair headphones | | current_speed | speed_unit | display speed
tell me what the speedometer says | | d_v_d_player | adjust_type, percentage_value, front, rear | play a d v d for the kids | | deactivate_car_alarm | | deactivate safety system
silence the car alarm | | deactivate_lights | automatic_lights, dashboard_lights, daytime_running_lights, door_lights, fog_lights, hazard_lights, high_beams, interior_lights, low_beams, parking_lights, light, front, rear, driver_side, left_side, passenger_side, right_side, duration, percentage_value, trunk | all lights off
cabin light off
deactivate adaptive driving beam | | decline_call | contact_name | decline call from ylana
dismiss incoming call | | decrease_cruise_control | speed_unit, adjust_type, percentage_value, number_unit | decrease cruise control speed by eighty kilometers per hour
decrease cruise forty
lower the speed of the cruise | | decrease_display | rear, front | dim the screen | | decrease_fan_speed | front, rear, driver_side, left_side, passenger_side, right_side, hvac, adjust_type, percentage_value, duration, number_unit, safety_lock duration, number_unit, safety_lock | change fan to slower
decrease fan some | | decrease_lights | automatic_lights, dashboard_lights, daytime_running_lights, door_lights, fog_lights, hazard_lights, high_beams, interior_lights, low_beams, parking_lights, light, front, rear, driver_side, left_side, passenger_side, right_side, percentage_value, trunk | dim interior lights
turn down the dashboard lights | | decrease_seat_warmers | front, rear, driver_side, left_side, passenger_side, right_side, adjust_type, percentage_value | decrease my seat warmer
make warmers cooler
set all of my seat warmers down | | decrease_steering_wheel_warmer | | | | decrease_temperature | adjust_type, percentage_value, front, rear, driver_side, passenger_side, hvac, number_unit, temperature_unit | decrease heat
decrease temperature by ten degrees celsius
it's hot in here | | decrease_volume | adjust_type, percentage_value, front, rear, number_unit front, rear, number_unit | crank audio level down
audio softer | | decrease_wiper_speed | front, rear, percentage_value, number_unit | decrease back windshield wiper speed
lower the front wiper by a little bit | | deny | | do not send
no | | feature_list | | what can i say
what features do you have again | | fuel_level | | check fuel level
do i need to get gas soon
how empty is the petrol tank | | give_duration | duration | five minutes
for two minutes | | give_name | contact_name | jeff
megan | | give_time | time | ten a m
three thirty p m | | good_bye | | see you later | | greet | | hello | | honk_horn | | beep the horn
honk | | increase_cruise_control | speed_unit, adjust_type, percentage_value, number_unit | bump up the speed by three miles per hour
increase cruise speed ten | | increase_display | rear, front | make the screen brighter
increase display brightness | | increase_fan_speed | front, rear, driver_side, left_side, passenger_side, right_side, hvac, adjust_type, percentage_value, duration, number_unit, safety_lock | crank up the fans
change fan speed to higher speed | | increase_lights | automatic_lights, dashboard_lights, daytime_running_lights, door_lights, fog_lights, hazard_lights, high_beams, interior_lights, low_beams, parking_lights, light, front, rear, driver_side, left_side, passenger_side, right_side, percentage_value, trunk, driver_side, right_side | make the dome lights brighter | | increase_seat_warmers | front, rear, driver_side, left_side, passenger_side, right_side, adjust_type, percentage_value | change seat warmer faster speed
increase all my seat warmer | | increase_steering_wheel_warmer | | | | increase_temperature | adjust_type, percentage_value, front, rear, driver_side, passenger_side, hvac, number_unit, temperature_unit, temperature_unit | crank up heater
i'm extremely cold
increase the a c by fifty percent
make it warmer | | increase_volume | adjust_type, percentage_value, front, rear, number_unit, artist | change the volume louder
crank up the music level | | increase_wiper_speed | front, rear, percentage_value, number_unit | all of my wipers higher | | lock_door | front, rear, driver_side, left_side, passenger_side, right_side, safety_lock, van, fuel, hood, trunk, etc. | activate child safety locks
doors lock up | | lock_window | front, rear, driver_side, left_side, passenger_side, right_side | activate window child safety lock | | lower_steering_wheel | adjust_type | move my steering wheel down | | manual_default | | display vehicle diagnostics
where is the driver's manual | | manual_garage | | how do i program the garage remote | | manual_set_memory | | can i save the passenger seat position
save the driver seat memory two | | manual_topic | help_feature | alarm help
help cameras
how do I adjust the headrest
manual page for clock settings
what is the recommended tire pressure for my car | | music_player | player_action, album_title, artist, genre, podcast, song_title | play ,,
find me some music by
listen to , | | navigation | navigation_location | how do I reach the nearest railway station
best way to the closest park | | no_command | *any* | *all unrecognized commands* | | odometer_reset | | first trip counter to zero
reset odometer | | odometer_total | | current mileage
how many miles on car | | odometer_trip | | display trip distance
what's the trip odometer | | oil_level | number_unit | check oil level
how much oil is left | | open_door | front, rear, driver_side, left_side, passenger_side, right_side, fuel_door, garage_door, roof, side_door, van_door, percentage_value, roof | front right door open
my sliding doors pop open
open up all of the doors | | open_glove_box | glove_box | lower the glove compartment
open the cubby hole | | open_hood | hood, hazard_lights | bonnet open
please open the hood | | open_trunk | trunk, rear | boot open
my tailgate door pop open
open the barn doors | | open_window | front, rear, driver_side, left_side, passenger_side, right_side, roof, percentage_value, adjust_type, number_unit, length_unit adjust_type, number_unit, length_unit | roll down my windows
crack the moonroof open
lower windows by twenty five percent | | query_airbag | driver_side, passenger_side, right_side, left_side | is the passenger airbag engaged | | query_blinker | right_side, left_side, driver_side, passenger_side | are my blinkers on
turn signal status | | query_car_mode | car_mode | car mode status
what's the car mode | | query_cruise_control | speed_unit | what is the cruise set at
what speed is the cruise on | | query_date | date | what is today's date | | query_defrost | front, rear | is front defrost on
what is the status of the rear defrost | | query_door_lock | front, rear, driver_side, left_side, passenger_side, right_side, roof, safety_lock, fuel_door, garage_door, van_door, trunk, hood, glove_box | are the back doors locked
did i lock the car
is the child-safety lock on | | query_door_open | front, rear, driver_side, left_side, passenger_side, right_side, roof, safety_lock, fuel_door, garage_door, van_door, trunk, hood, glove_box | is the fuel door open
is the hood closed
did i remember to shut the garage door
which doors are open | | query_fan | fan_direction | is the fan set to footwell
where is the air blowing | | query_fan | front, rear | are the fans on
what is the fan speed | | query_lane_assist | | is the lane keeping aid active | | query_light | automatic_lights, dashboard_lights, daytime_running_lights, door_lights, fog_lights, hazard_lights, high_beams, interior_lights, low_beams, parking_lights, light, front, rear, driver_side, left_side, passenger_side, right_side, duration, percentage_value, trunk | did i leave the low beams on
is the dome light on
what is the status of the daytime running lights | | query_park_assist | | is the parking assistant on | | query_parking_brake | | is the parking brake engaged | | query_radio_station | | what channel is this | | query_seat_warmers | front, rear, driver_side, left_side, passenger_side, right_side, percentage_value, adjust_type | are the heated seats on
is the front right heated seat on | | query_speed_limit | | what's the speed limit here | | query_steer_assist | | is the steering assistant engaged | | query_temperature | driver_side, passenger_side, hvac, percentage_value | check how cold is it inside
check temp in the car
tell me the current temperature of the car | | query_temperature_outside | date | check exterior temperature today for me
how hot is it outside my car | | query_time | time_zone | what time is it in berlin
what's the time now | | query_timer | | how much left on the timer | | query_volume | | how loud is the radio
what is the volume | | query_warning_light | | check dashboard lights
describe dashboard warning
tell me what is that light | | query_weekday | date | what day is it today
what's today | | query_window | front, rear, driver_side, left_side, passenger_side, right_side, roof, percentage_value, adjust_type, number_unit, length_unit | is the back right window open
are any windows open | | query_wipers | front, rear | are the wipers on
what is the speed of the wipers | | query_year | | what year is it | | raise_steering_whell | | move my steering wheel closer
raise steering wheel up a bit | | range | | distance until empty
how far before i need to refuel | | reset_demo | | reset the demo
start over | | scan_radio | | scan the fm radio
search for radio stations | | seat_backwards | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | adjust driver seat back
move driver seat backwards | | seat_cooling | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | activate seat cooling
increase seat ventilation | | seat_custom | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | adjust seat to preset one
disable passenger seat suspension
switch my driver's seat to saved position four | | seat_forward | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | move driver seat forward | | seat_incline | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | move front seat back up
sit up straighter | | seat_lower | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | lower my seat two inches | | seat_raise | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | lift up my seat
make my seat taller | | seat_recline | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | lay the seat back
recline the passenger seatback | | send_message | contact_name, message | message matt i'm on my way
ask batty by text are you free for dinner
compose note to john in gchat
dm whitney | | service_reminder | duration, time, date, number_unit, percentage_value, time_of_day | am i scheduled for an oil change soon
begin charge at two a m
schedule an appointment for a tune up | | set_car_mode | car_mode | activate traction mode
enable sports mode | | set_cruise_control | speed_unit, adjust_type, percentage_value, number_unit | cruise control set to seventy eight
set cruise control feature down to fifty five | | set_display | rear, front | adjust the screen settings | | set_fan | cabin_vent, dual_air, floor_vent, windshield_vent, driver_side, percentage_value | blow air on my feet
direct air flow to the windshield | | set_fan | front, rear, driver_side, left_side, passenger_side, right_side, hvac, adjust_type, percentage_value, duration, number_unit, safety_lock duration, number_unit, safety_lock | activate max a c
adjust fan speed to low setting
change the fan to four | | set_lights | automatic_lights, dashboard_lights, daytime_running_lights, door_lights, fog_lights, hazard_lights, high_beams, interior_lights, low_beams, parking_lights, light, front, rear, percentage_value, driver_side, left_side, passenger_side, right_side, trunk | change cab light to max intensity | | set_off_car_mode | car_mode | cancel hill start assist
cut off auto pilot | | set_radio | radio_station, genre | change station to w i p b
listen to radio station a m ten seventy | | set_seat_warmers | front, rear, driver_side, left_side, passenger_side, right_side, adjust_type, percentage_value | both seat warmers to three
can you turn my seat heater on medium
change warmers highest speed | | set_steering_wheel_warmer | | | | set_temperature | adjust_type, percentage_value, front, rear, driver_side, passenger_side, hvac, number_unit, temperature_unit | a c at seventy two degrees
change front heater setting to low
make the temperature sixty two degrees fahrenheit | | set_time | | reset the clock | | set_volume | adjust_type, artist, music_service, song_title | change audio volume zero
mute music | | set_volume | adjust_type, percentage_value, front, rear, number_unit, front, rear, number_unit | adjust the volume
change audio to fifty percent
make volume level three | | set_wipers | front, rear, percentage_value, number_unit | all the wipers high
my rear wiper highest | | speed_limiter | | activate speed limiter
decrease the governor
increase limiter to forty five miles per hour | | timer_on | duration | set a timer for ten minutes | | timer_off | | cancel timer | | tire_pressure | front, rear, driver_side, left_side, passenger_side, right_side | are my tires flat
what is my front right tire pressure | | turn_off_airbag | driver_side, passenger_side, left_side, right_side | activate passenger side airbag | | turn_off_blinker | side | blinkers off
deactivate left turn signal | | turn_off_bluetooth | | bluetooth disconnect car device
delete my bluetooth pairing for android | | turn_off_car | | cut the engine
disable vehicle | | turn_off_cruise_control | | cancel cruise control | | turn_off_defrost | front, rear, driver_side, left_side, passenger_side, right_side, percentage_value, adjust_type | all my back glass defroster off
stop defroster | | turn_off_display | rear, front | turn off the screens | | turn_off_fan | front, rear, hvac, adjust_type, percentage_value | a c off
can you turn off the heat
cut fan | | turn_off_lane_assist | | turn off lane assist
deactivate l d w | | turn_off_navigation | | quit g p s
stop navigation | | turn_off_park_assist | | turn off parking assistant | | turn_off_parking_brake | roof | release e brake | | turn_off_radio | | turn off radio | | turn_off_seat_warmers | front, rear, driver_side, left_side, passenger_side, right_side | all seat warmers off
disengage right seat's seat warmers | | turn_off_steer_assist | | disengage steering assistant | | turn_off_steering_wheel_warmer | | | | turn_off_wipers | front, rear | back wipers off
deactivate all wipers | | turn_on_airbag | driver_side, passenger_side, left_side, right_side | | | turn_on_blinker | side | left blinker on
turn on the right blinker | | turn_on_bluetooth | | engage bluetooth
set up bluetooth | | turn_on_car | | begin engine
start the car | | turn_on_cruise_control | | begin cruise control feature
enable cruise control | | turn_on_defrost | front, rear, driver_side, left_side, passenger_side, right_side, percentage_value, adjust_type | activate front defrost only
demist rear windshield | | turn_on_display | rear, front | turn on the heads up display
can you turn on the rear tv screens
turn on the kids screens | | turn_on_fan | front, rear, hvac | activate a c
initiate heater
power on fan | | turn_on_lane_assist | | activate l t a
turn on lane keeping aid | | turn_on_navigation | | begin navigation
start the g p s | | turn_on_park_assist | | park the car
turn on parking sensors | | turn_on_parking_brake | | activate parking brake
engage hand brake | | turn_on_radio | | turn on the radio | | turn_on_seat_warmers | front, rear, driver_side, left_side, passenger_side, right_side | activate front left seat's seat heaters
both seat heaters on | | turn_on_steer_assist | | begin steer control | | turn_on_steering_wheel_warmer | | | | turn_on_wipers | front, rear | activate front wiper
turn on rear wiper | | unlock_door | front, rear, driver_side, left_side, passenger_side, right_side, roof, safety_lock, fuel_door, garage_door, van_door, trunk, hood, glove_box | all door child locks disable
driver side door unlock
| | unlock_window | front, rear, driver_side, left_side, passenger_side, right_side | all my windows child lock off | | warning_light._specific | warning_light | check engine light
provide information regarding blinking check engine light on dashboard
what does the green oil light indicate | | wash_window | front, rear | clean all back glass
mist my front windshield | | washer_fluid_level | | check washer fluid level | ## Templates Templates add functionality to recognizer models. This includes running models simultaneously or sequentially, and adding VAD audio gating. See [template types](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) for an overview. ### tpl-spot-concurrent-1.5.0.snsr Runs two [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models at the same time. **Also see these related items:** [tpl-spot-concurrent](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-concurrent.md#tpl-spot-concurrent-type) ### tpl-spot-debug-1.5.1.snsr Adds runtime data collection to a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) model. **Also see these related items:** [tpl-spot-debug](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-debug.md#tpl-spot-debug-type) ### tpl-spot-select-1.4.0.snsr Dynamically selects which of the two embedded [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models to run. **Also see these related items:** [tpl-spot-select](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-select.md#tpl-spot-select-type) ### tpl-spot-sequential-1.5.0.snsr Runs two [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models one after the other, with optional looping on the second. Includes push-to-talk as an alternative to the wake word. **Also see these related items:** [tpl-spot-sequential](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-sequential.md#tpl-spot-sequential-type) ### tpl-spot-vad-3.13.0.snsr Runs a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) until it spots, then does start- and endpoint detection on the subsequent audio stream using a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type). 
**Also see these related items:** [tpl-spot-vad](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad.md#tpl-spot-vad-type) ### tpl-opt-spot-vad-lvcsr-1.28.0.snsr _(TrulyNatural only)_ Optionally runs a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) until it spots, segments the subsequent audio stream with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type), then sends the segmented audio to an [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer. You can select at runtime whether recognition waits on the wake word or starts immediately. **Also see these related items:** [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type) ### tpl-spot-vad-lvcsr-3.23.0.snsr _(TrulyNatural only)_ Runs a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) until it spots, segments the subsequent audio stream with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type), then sends the segmented audio to an [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer. **Also see these related items:** [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type) ### tpl-vad-lvcsr-3.17.0.snsr Detects speech with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) and sends the segmented audio to an [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer. 
**Also see these related items:** [tpl-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-vad-lvcsr.md#tpl-vad-lvcsr-type) [THF Micro]: https://doc.sensory.com/thf-micro/ "THF Micro documentation" *[API]: Application Programming Interface *[EFT]: Enrolled Fixed Trigger: fixed wake words adapted to a speaker to improve accuracy *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[NLU]: Natural Language Understanding model *[OSS]: Open-source software *[PNC]: Punctuation and Capitalization, an STT model variant that emits cased text with punctuation *[SDK]: Software Development Kit *[STT]: Speech To Text: transformers with language model and CTC decoding *[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[UDT]: User-Defined Trigger: enrolled wake words and command sets *[VAD]: Voice Activity Detector --- source_path: "models/tpl/index.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/" --- # Templates Task templates are models that use composition to add behavior to [basic model types](https://doc.sensory.com/tnl/7.8/models/types/index.md#model-types). Templates have _slots_ that you can fill with any model that has a [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type) that matches what the slot expects. 
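The slot-matching rule can be pictured with a small, purely conceptual sketch. The `Model` and `Slot` structs, the `slot_fill` function, and the task-type strings below are hypothetical stand-ins invented for illustration; they are not SDK types or calls, and the SDK's own validation may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; NOT part of the SDK. */
typedef struct {
    const char *name;       /* model file name (illustrative) */
    const char *task_type;  /* task-type label, e.g. "phrasespot" */
} Model;

typedef struct {
    const char *expected_task_type;  /* task-type the slot accepts */
    const Model *model;              /* model in the slot, NULL if empty */
} Slot;

/* Fill the slot only when the model's task-type matches what the slot
 * expects. Returns true on success; leaves the slot unchanged otherwise. */
static bool slot_fill(Slot *slot, const Model *model)
{
    if (slot == NULL || model == NULL) {
        return false;
    }
    if (strcmp(slot->expected_task_type, model->task_type) != 0) {
        return false;  /* task-type mismatch: reject the model */
    }
    slot->model = model;
    return true;
}
```

In this sketch a slot that expects `phrasespot` accepts a model whose task-type is `phrasespot` and rejects any other model, leaving the slot empty.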
The [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type) template, for example, waits for the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0), then runs the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) model in slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1). The composed model has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type) `==` [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) and implements all of the events and settings expected of such a model type. You can use it as a drop-in replacement for a wake word in (say) [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spotc) without any code changes. Compose new template-based models with [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit), on the fly with [snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval), or by using the [setStream](https://doc.sensory.com/tnl/7.8/api/inference.md#setters) function at runtime. ## Composed models [tpl-spot-concurrent](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-concurrent.md#tpl-spot-concurrent-type) - Runs two [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models at the same time. It provides a convenient way to create a single wake word model that has the combined vocabulary of two other models. [tpl-spot-debug](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-debug.md#tpl-spot-debug-type) - Adds runtime data collection to a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) model. 
Use this to collect audio and event timings from an embedded model, [snsr-log-split](https://doc.sensory.com/tnl/7.8/tools/snsr-log-split.md#snsr-log-split) to extract audio, event logs, and the model itself from the generated log file, and [audio-check](https://doc.sensory.com/tnl/7.8/tools/audio-check.md#audio-check) to verify audio recording quality. [tpl-spot-select](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-select.md#tpl-spot-select-type) - Allows you to dynamically select which of the two embedded [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models to run. [tpl-spot-sequential](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-sequential.md#tpl-spot-sequential-type) - Runs two [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models in sequence. Use this to listen for a trigger phrase followed by a command, for example: "Voice genie, play music." [tpl-spot-vad](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad.md#tpl-spot-vad-type) - Runs the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) until it detects, then does start- and endpoint detection with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) on the audio stream following the wake word. 
[tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type) _(TrulyNatural only)_ - _Optionally_ runs the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) until it detects, then segments the audio following the wake word with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) and sends the segmented audio to the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer in slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1). [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type) _(TrulyNatural only)_ - Runs the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) until it detects, segments the audio following the wake word with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type), and sends the segmented audio to the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer in slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1). [tpl-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-vad-lvcsr.md#tpl-vad-lvcsr-type) _(TrulyNatural only)_ - Detects speech with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) and sends the segmented audio to the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0). 
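The slot rule described above (a slot accepts any model whose task-type matches what the slot expects) can be pictured as a small compatibility check. The following Python is a model of that rule only, not SDK code: the `TEMPLATES` table and the `check_fill` helper are hypothetical names introduced for illustration, and the expected task types are the ones listed on the individual template pages in this documentation.

```python
# Expected task-type per slot, as listed on each template's page.
# This table is an illustration, not something the SDK exposes in this form.
TEMPLATES = {
    "tpl-opt-spot-vad-lvcsr": {"0": "phrasespot", "1": "lvcsr"},
    "tpl-spot-concurrent": {"0": "phrasespot", "1": "phrasespot"},
    "tpl-spot-debug": {"0": "phrasespot"},
}


def check_fill(template: str, slot: str, model_task_type: str) -> bool:
    """Return True if a model with this task-type fits the template slot."""
    expected = TEMPLATES[template].get(slot)
    return expected is not None and expected == model_task_type


# A wake word model fits either slot of tpl-spot-concurrent.
print(check_fill("tpl-spot-concurrent", "1", "phrasespot"))
# A wake word model does not fit the recognizer slot of tpl-opt-spot-vad-lvcsr.
print(check_fill("tpl-opt-spot-vad-lvcsr", "1", "phrasespot"))
```

Because a composed template reports its own task-type, the same check explains why a composed `phrasespot` model can itself be placed in a slot that expects `phrasespot`.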
*[API]: Application Programming Interface *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "models/tpl/tpl-opt-spot-vad-lvcsr.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr/" --- # tpl-opt-spot-vad-lvcsr _(TrulyNatural only)_ This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) _optionally_ runs the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) until it detects, then segments the audio following the wake word with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) and sends the segmented audio to the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer in slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1). [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot) controls whether `tpl-opt-spot-vad-lvcsr` waits for the wake word: * With `slot == 0` it waits for the wake word before starting the VAD. In this mode the behavior is that of [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type). * With `slot == 1` it starts the VAD immediately and the behavior is that of [tpl-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-vad-lvcsr.md#tpl-vad-lvcsr-type). You can change [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot) at runtime. Use this to gate only the first of a series of commands with a wake word. 
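One way to picture the runtime switch is as a two-state gate driven by events. The sketch below is plain Python, not SDK code; `SlotGate`, `on_result`, and `on_silence` are hypothetical names. It assumes the application writes `slot = 1` after a recognition result and `slot = 0` after a silence timeout, which is the pattern the Notes section of this page describes.

```python
class SlotGate:
    """Tracks the value an application would write to the `slot` setting."""

    def __init__(self) -> None:
        self.slot = 0  # 0: wait for the wake word before starting the VAD

    def needs_wake_word(self) -> bool:
        return self.slot == 0

    def on_result(self) -> None:
        # A command was recognized: follow-up commands skip the wake word.
        self.slot = 1

    def on_silence(self) -> None:
        # Nobody spoke before the timeout: require the wake word again.
        self.slot = 0


gate = SlotGate()
history = []
for event in ["result", "result", "silence", "result"]:
    # Record whether this interaction had to start with the wake word.
    history.append(gate.needs_wake_word())
    if event == "result":
        gate.on_result()
    else:
        gate.on_silence()

print(history)  # [True, False, False, True]
```

Only the first interaction, and the first one after a silence timeout, require the wake word in this model.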
`tpl-opt-spot-vad-lvcsr` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot). Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type): * **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) * **Slot 1:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr) **Also see these related items:** [tpl-opt-spot-vad-lvcsr-1.28.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-opt-spot-vad-lvcsr), [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type), [tpl-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-vad-lvcsr.md#tpl-vad-lvcsr-type) ## Operation ```mermaid flowchart TD start((start)) slotCheck0{slot == 0?} start --> slotCheck0 slotCheck0 -->|yes| startWW slotCheck0 -->|no| fetch0 subgraph slot0[slot 0 (phrasespot)] startWW((start)) fetchWW[/samples from ->audio-pcm/] audioWW(^sample-count) processWW[process] result(0.^result) stopWW((stop)) startWW --> fetchWW fetchWW --> audioWW audioWW --> processWW processWW --> fetchWW processWW -->|recognize| result result --> stopWW end subgraph slot1[slot 1 (lvcsr)] startSTT((start)) startSTTfinal((start)) stopSTT((stop)) stopSTTpartial((stop)) processSTT[process] partialSTT(^result-partial) intentSTT(^nlu-intent) slotSTT(^nlu-slot) resultSTT(^result) nluSTT{NLU
match?} slmSTT{SLM
included?} generateSTT[generate] slmstartSTT(^slm-start) slmresultpartialSTT(^slm-result-partial) slmresultSTT(^slm-result) startSTT --> processSTT processSTT ---->|hypothesis| partialSTT partialSTT --> stopSTTpartial startSTTfinal --> nluSTT nluSTT -->|yes| intentSTT nluSTT -->|no| resultSTT intentSTT --> slotSTT slotSTT --> resultSTT slotSTT -->|more| intentSTT resultSTT --> slmSTT slmSTT -->|yes| slmstartSTT slmSTT -->|no| stopSTT slmstartSTT -->|OK| generateSTT slmstartSTT -->|STOP| stopSTT generateSTT -->|response| slmresultpartialSTT slmresultpartialSTT --> generateSTT generateSTT -->|done| slmresultSTT slmresultSTT --> stopSTT end listenBegin(^listen-begin) listenEnd(^listen-end) stopWW --> listenBegin listenBegin --> fetch0 fetch0[/samples from ->audio-pcm/] fetch1[/samples from ->audio-pcm/] audio0(^sample-count) audio1(^sample-count) silence(^silence) begin(^begin) END(^end) limit(^limit) process0[VAD process] process1[VAD process] final@{ shape: f-circ } slotCheck1{slot == 0?} fetch0 --> audio0 audio0 --> process0 process0 --> fetch0 process0 -->|speech start| begin process0 -->|timeout| silence silence ~~~ final silence --> slotCheck1 begin --> fetch1 fetch1 --> audio1 audio1 --> process1 process1 --> startSTT stopSTTpartial --> fetch1 process1 -->|speech end| END process1 -->|speech limit| limit END --> final limit --> final final --> startSTTfinal stopSTT --> slotCheck1 slotCheck1 -->|no| fetch0 slotCheck1 -->|yes| listenEnd listenEnd --> startWW ``` Operation flow. 1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 3. If processing does not detect a wake word, continue at step 1. 4. Invoke [0.^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) for the wake word. 5. 
Invoke [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin) and start VAD processing. 6. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 7. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 8. If VAD processing does not detect the start of speech within the [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) timeout, invoke [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) and continue at step 15. 9. Invoke [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin) if processing detects the start of speech, else continue at step 6. 10. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 11. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 12. If VAD processing detects an endpoint, invoke either [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit) or [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end) and continue at step 14. 13. Process VAD segmented audio in the LVCSR or STT recognizer. * Invoke [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) with the interim recognition hypothesis. * Continue at step 10. 14. Produce a final LVCSR or STT recognition hypothesis. * Invoke [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) for each NLU intent found. * Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) with the final recognition hypothesis. * If there's no SLM, continue at step 15. 
* Invoke [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start), if the callback returns [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop), continue at step 15. * Generate SLM result, invoking [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial) on each generated token. * Invoke [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result) with complete SLM result. 15. Invoke [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end) and start listening for the wake word again at step 1. Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in. ## Settings **Available events:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin), [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end), [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event), [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence), [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result), [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial), 
[^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start) **Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator) **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) **Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to) **Available configuration settings:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [backlog-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backlog-interval), [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [custom-vocab](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#custom-vocab), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence), [include-wake-word-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-wake-word-audio), 
[leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot), [stt-profile](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#stt-profile), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold), [wake-word-at-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#wake-word-at-end) **Available values:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr), [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code) ## Notes Use this template for command and control type applications where commands are initiated with a wake word in certain contexts and not in others. 
Set [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot)`= 1` in the [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) handler, and [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot)`= 0` in the [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) handler. With this configuration the recognizer requires a wake word to start listening only for the first in a series of interactions. After this it will revert to requiring a wake word only if the user does not say anything for at least [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) ms. VAD settings [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording), and [trailing-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#trailing-silence) apply to both slot 0 and slot 1, but [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence) applies only to slot 0. Set [include-wake-word-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-wake-word-audio)` = 1` to include the wake word audio in the samples passed to the LVCSR or STT recognizer. STT hypotheses do not include the wake word text unless Sensory specifically configured the model to do so. The [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) and [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) events are for the LVCSR or STT recognizer in slot 1. 
If you need direct access to the wake word result, prefix the event with the slot path: `0.^result`. Use the slot prefix to read values in the `0.^result` event handler too; for example, call [getString](https://doc.sensory.com/tnl/7.8/api/inference.md#getters) with key [0.text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) to read the wake word transcription. ## Examples ### Select wake-word or VAD-only behavior ```console % cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0 % bin/snsr-edit -o opt-vg-stt.snsr\ -t model/tpl-opt-spot-vad-lvcsr-1.28.0.snsr\ -f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr\ -f 1 model/stt-enUS-automotive-medium-2.3.15-pnc.snsr\ -s include-wake-word-audio=1 # Say "Voice genie, open the sunroof." % snsr-eval -vt opt-vg-stt.snsr Using live audio from default capture device. ^C to stop. P 33010 33490 (0.3201) Open the sun P 33050 33890 (0.7712) Open the sunroof 32010 34185 [^end] VAD speech region. NLU intent: open_window (0.9956) = open the sunroof NLU entity: roof (0.9595) = sunroof 33050 33890 (0.5731) Open the sunroof. ^C # Select the VAD-only path with slot=1 # Say "Close all the windows" % snsr-eval -vt opt-vg-stt.snsr -s slot=1 Using live audio from default capture device. ^C to stop. P 2150 2670 (0.257) Clothes. All P 2190 3150 (0.7631) Close. All the wind P 2190 3430 (0.9899) Close all the windows 1950 3855 [^end] VAD speech region. NLU intent: close_window (0.9977) = close all the windows 2190 3470 (0.9244) Close all the windows. ^C ``` ### Use trailing wake-word Recognize a phrase with the wake word at either end of an utterance. ```console % cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0 % bin/snsr-edit -o opt-vg-stt-vg.snsr\ -t model/tpl-opt-spot-vad-lvcsr-1.28.0.snsr\ -f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr\ -f 1 model/stt-enUS-automotive-medium-2.3.15-pnc.snsr\ -s include-wake-word-audio=1\ -s wake-word-at-end=1 # Say "Voice genie, set the radio to 91.5 FM." 
% bin/snsr-eval -vt opt-vg-stt-vg.snsr Using live audio from default capture device. ^C to stop. P 4360 5000 (0.2927) Set. The radio P 4400 5280 (5.7e-07) Set the radio to n P 4400 5760 (0.7336) Set the radio to ninety-one P 4400 6120 (0.6005) Set the radio to ninety one point P 4400 6440 (0.5195) Set the radio to ninety one point. Five P 4400 6480 (0.6733) Set the radio to ninety one point. Five 3405 7455 [^end] VAD speech region. NLU intent: set_radio (0.9674) = set the radio to 91.5 FM NLU entity: radio_station (0.9688) = 91.5 FM 4400 7080 (0.3896) Set the radio to ninety one point. Five F. M. 15225 17490 [^end] VAD speech region. # Say "Will it rain in Portland tomorrow, Voice Genie?" NLU intent: no_command (0.9977) = will it rain in portland tomorrow NLU entity: time (0.9773) = tomorrow 15460 17260 (0.6731) Will it rain in Portland tomorrow? ^C ``` *[API]: Application Programming Interface *[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[NLU]: Natural Language Understanding model *[SLM]: Generative Small Language Model *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "models/tpl/tpl-spot-concurrent.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-concurrent/" --- # tpl-spot-concurrent This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) runs two [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models at the same time. It provides a convenient way to create a single wake word model that has the combined vocabulary of two other models. 
`tpl-spot-concurrent` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot). Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type): * **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) * **Slot 1:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [tpl-spot-concurrent-1.5.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-concurrent) ## Operation ```mermaid flowchart TD start((start)) fetch[/samples from ->audio-pcm/] audio(^sample-count) split@{ shape: f-circ } join@{ shape: f-circ } start --> fetch fetch --> audio audio --> split split --> start0 split --> start1 end0 --> join end1 --> join join ----> fetch subgraph slot0[slot 0 (phrasespot)] start0((start)) process0[process] result0(^result) end0((stop)) start0 --> process0 process0 --> end0 process0 -->|recognize| result0 result0 --> end0 end subgraph slot1[slot 1 (phrasespot)] start1((start)) process1[process] result1(^result) end1((stop)) start1 --> process1 process1 --> end1 process1 -->|recognize| result1 result1 --> end1 end ``` Operation flow. 1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 3. Send audio samples to recognizers in slot 0 and slot 1. 4. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase. 5. 
Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok). Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in. ## Settings **Available events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event) **Available iterators:** _none_ **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) **Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to) **Available configuration settings:** [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0), [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1), [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second) **Available values:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), 
[snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code) ## Notes Runs the wake word models in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) and slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1) at the same time, in the same thread. The two recognizers are entirely independent, and can produce results that overlap in time. For production use Sensory recommends custom multi-phrase wake word recognizers instead. These have improved false reject / false accept performance. The combined model is a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type), and can be used in any application that expects such a model without API changes. Configuration settings and iterators are not available in the combined model. You can access these for the individual models by prefixing the setting path with the slot. For example, use `0.operating-point` to read or change the [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) of the first spotter and use `1.operating-point` to read or change the [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) of the second spotter. Attempting to set [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) without a slot prefix will result in an error. 
```console % cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0 % bin/snsr-edit -o vg-hbg.snsr\ -t model/tpl-spot-concurrent-1.5.0.snsr\ -f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr\ -f 1 model/spot-hbg-enUS-1.4.0-m.snsr % bin/snsr-edit \ -t vg-hbg.snsr \ -s operating-point=5 Setting "operating-point" not found, did you mean "0.operating-point" or "1.operating-point"? ``` Change individual settings at runtime by prefixing the setting name with the slot: **C/C++** ```c /* Set the operating point for spotter 0 only. */ snsrSetInt(session, SNSR_SLOT_0 SNSR_OPERATING_POINT, 7); ``` **Java** ```java /* Set the operating point for spotter 0 only. */ session.setInt(Snsr.SLOT_0 + Snsr.OPERATING_POINT, 7); ``` **Python** ```python # Set the operating point for spotter 0 only. session.set_int(snsr.SLOT_0 + snsr.OPERATING_POINT, 7) ``` You can recombine combined models and nest them to an arbitrary depth to run any number[^1] of wake word recognizers at the same time: [^1]: Limited only by available RAM and CPU. 
```console % cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0 % bin/snsr-edit -o models1-4.snsr\ -t model/tpl-spot-concurrent-1.5.0.snsr\ -f 0 model/tpl-spot-concurrent-1.5.0.snsr\ -f 0.0 model-1.snsr\ -f 0.1 model-2.snsr\ -f 1 model/tpl-spot-concurrent-1.5.0.snsr\ -f 1.0 model-3.snsr\ -f 1.1 model-4.snsr ``` In this example, the four wake word models are located in the `0.0`, `0.1`, `1.0`, and `1.1` slots. ## Examples ```console % cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0 % bin/snsr-edit -o vg-hbg.snsr\ -t model/tpl-spot-concurrent-1.5.0.snsr\ -f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr\ -f 1 model/spot-hbg-enUS-1.4.0-m.snsr % bin/snsr-eval -t vg-hbg.snsr 2370 2940 voicegenie 5805 6420 voicegenie 7740 8640 hello blue genie 10440 11100 voicegenie 12060 12870 hello blue genie ^C ``` *[API]: Application Programming Interface *[RAM]: Random Access Memory *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "models/tpl/tpl-spot-debug.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-debug/" --- # tpl-spot-debug This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) adds runtime data collection to a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) model. Use this to collect audio and event timings from an embedded model, [snsr-log-split](https://doc.sensory.com/tnl/7.8/tools/snsr-log-split.md#snsr-log-split) to extract audio, event logs, and the model itself from the generated log file, and [audio-check](https://doc.sensory.com/tnl/7.8/tools/audio-check.md#audio-check) to verify audio recording quality. `tpl-spot-debug` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot). 
Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type): * **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [tpl-spot-debug-1.5.1.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-debug) ## Operation ```mermaid flowchart TD start0((start)) log@{ shape: doc, label: "debug-log-file" } start0 --> start slot0 -.-> log subgraph slot0[slot 0 (phrasespot)] start((start)) fetch[/samples from ->audio-pcm/] audio(^sample-count) process result(^result) start --> fetch fetch --> audio audio --> process process --> fetch process -->|recognize| result result --> fetch end ``` Operation flow. 1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 3. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase. 4. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok). Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in. 
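
The operation flow above amounts to a simple dispatch loop. The following is a minimal Python sketch that models that loop without calling the SDK. The names `run_spotter` and `detect`, and the string return codes, are hypothetical stand-ins chosen for illustration; they are not part of the TrulyNatural API.

```python
OK, STOP, STREAM_END = "OK", "STOP", "STREAM_END"  # stand-ins for SDK return codes


def run_spotter(chunks, detect, handlers):
    """Model the read -> ^sample-count -> ^result loop.

    chunks:   iterable of audio blocks (any sized sequences)
    detect:   hypothetical recognizer, returns a phrase or None per block
    handlers: dict of event name -> callable; unregistered events are skipped
    """

    def fire(event, payload):
        handler = handlers.get(event)
        return OK if handler is None else handler(payload)

    samples = 0
    for chunk in chunks:                      # step 1: read audio
        samples += len(chunk)
        rc = fire("^sample-count", samples)   # step 2
        if rc != OK:
            return rc                         # step 4: a handler ended the run
        phrase = detect(chunk)
        if phrase is not None:
            rc = fire("^result", phrase)      # step 3
            if rc != OK:
                return rc
    return STREAM_END                         # step 4: the audio ran out
```

Because unregistered events are skipped, a caller that only cares about detections passes a single `"^result"` handler, which mirrors the advice to register handlers only for the events of interest.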
## Settings **Available events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event) **Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator) **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) **Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream), [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream) **Available configuration settings:** [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0), [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [debug-log-file](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#debug-log-file), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), 
[include-model](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-model), [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold) **Available values:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code) ## Notes You must specify the name of the log file with [debug-log-file](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#debug-log-file). [include-model](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-model) controls whether the log file includes a copy of the original task model. This is enabled by default. The combined model is a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) that you can use as a drop-in replacement for the original wake word without any API changes. Log files include time-stamped entries with: * SDK library information, * the spotter model being used, * audio samples, and * event callbacks. 
Extract text, model and audio data from the log file with the [snsr-log-split](https://doc.sensory.com/tnl/7.8/tools/snsr-log-split.md#snsr-log-split) utility. ## Examples ```console % cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0 % bin/snsr-edit -o hbg-debug.snsr\ -t model/tpl-spot-debug-1.5.1.snsr\ -f 0 model/spot-hbg-enUS-1.4.0-m.snsr\ -s debug-log-file=hbg-debug.snsrlog % bin/snsr-eval -t hbg-debug.snsr 2925 3690 hello blue genie 4995 5790 hello blue genie 7920 8640 hello blue genie ^C # The error below is harmless and expected when you # interrupt snsr-eval with ^C % bin/snsr-log-split -vv hbg-debug.snsrlog Writing to './' Processing hbg-debug.snsrlog -> audio ./hbg-debug.wav -> event ./hbg-debug.txt -> model ./hbg-debug.snsr Error: Input file "hbg-debug.snsrlog" is truncated. Processed 1273 items. ``` *[API]: Application Programming Interface *[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken *[SDK]: Software Development Kit *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "models/tpl/tpl-spot-select.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-select/" --- # tpl-spot-select This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) allows you to dynamically select which of the two embedded [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models to run. `tpl-spot-select` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot). 
Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type): * **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) * **Slot 1:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [tpl-spot-select-1.4.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-select) ## Operation ```mermaid flowchart TD start((start)) fetch[/samples from ->audio-pcm/] audio(^sample-count) join@{ shape: f-circ } start --> fetch fetch --> audio audio -->|slot == 0| start0 audio -->|slot == 1| start1 end0 --> join end1 --> join join ----> fetch subgraph slot0[slot 0 (phrasespot)] start0((start)) process0[process] result0(^result) end0((stop)) start0 --> process0 process0 --> end0 process0 -->|recognize| result0 result0 --> end0 end subgraph slot1[slot 1 (phrasespot)] start1((start)) process1[process] result1(^result) end1((stop)) start1 --> process1 process1 --> end1 process1 -->|recognize| result1 result1 --> end1 end ``` Operation flow. 1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 3. Send audio to the recognizer specified by [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot). 4. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase. 5. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok). 
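
The routing in step 3 can be pictured with a short Python model. This is a sketch only and does not use the SDK: `run_select`, `get_slot`, and the recognizer callables are hypothetical names, and re-reading the slot for every audio block is an assumption based on the diagram above.

```python
def run_select(chunks, recognizers, get_slot, on_result):
    """Send each audio block to the recognizer in the selected slot only.

    recognizers: dict {0: callable, 1: callable}; each returns a phrase or None
    get_slot:    callable returning the current value of the `slot` setting
    on_result:   called with (slot, phrase) for every detection
    """
    for chunk in chunks:
        slot = get_slot()                  # re-read per block so the choice can change
        phrase = recognizers[slot](chunk)  # the other slot never sees this block
        if phrase is not None:
            on_result(slot, phrase)
```

Only one recognizer processes any given block, so a phrase that belongs to the unselected slot goes undetected, as in the `slot=0` and `slot=1` runs of `snsr-eval` in the Examples section.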
Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in. ## Settings **Available events:** [^adapt-started](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapt-started), [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^new-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#new-user), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event) **Available iterators:** _none_ **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) **Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [delete-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#delete-user), [rename-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#rename-user) **Available configuration settings:** [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0), [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1), [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), 
[slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot) **Available values:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code) ## Notes Use [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot) to select either the spotter in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) or slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1). Use this template to reduce the model size when an application uses variants of the same recognizer in different contexts. This reduces the overall model size and RAM requirements as identical objects are shared between the slots. The combined model is a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type), and can be used in any application that expects such a model without API changes. Configuration settings and iterators are not available in the combined model. You can access these for the individual models by prefixing the setting path with the slot. For example, use `1.operating-point` to read or change the [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) of the second spotter. Change individual settings at runtime by prefixing the setting name with the slot: **C/C++** ```c /* Set the operating point for spotter 0 only. */ snsrSetInt(session, SNSR_SLOT_0 SNSR_OPERATING_POINT, 7); ``` **Java** ```java /* Set the operating point for spotter 0 only. 
*/ session.setInt(Snsr.SLOT_0 + Snsr.OPERATING_POINT, 7); ``` **Python** ```python # Set the operating point for spotter 0 only. session.set_int(snsr.SLOT_0 + snsr.OPERATING_POINT, 7) ``` ## Examples ```console % cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0 % bin/snsr-edit -o vg-hbg-select.snsr\ -t model/tpl-spot-select-1.4.0.snsr\ -f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr\ -f 1 model/spot-hbg-enUS-1.4.0-m.snsr # repeat "hello blue genie" and "voice genie" % bin/snsr-eval -t vg-hbg-select.snsr -s slot=0 3480 4140 voicegenie 9945 10545 voicegenie ^C # repeat "hello blue genie" and "voice genie" % bin/snsr-eval -t vg-hbg-select.snsr -s slot=1 1635 2460 hello blue genie 6210 6870 hello blue genie ^C ``` *[API]: Application Programming Interface *[RAM]: Random Access Memory *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "models/tpl/tpl-spot-sequential.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-sequential/" --- # tpl-spot-sequential This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) runs two [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models in sequence. Use this to listen for a trigger phrase followed by a command, for example: "Voice genie, play music." `tpl-spot-sequential` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot). 
Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type): * **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) * **Slot 1:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [tpl-spot-sequential-1.5.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-concurrent) ## Operation ```mermaid flowchart TD start((start)) loop0{loop == 2?} start --> loop0 loop0 -->|no| start0 loop0 -->|yes| start1 subgraph slot0[slot 0 (phrasespot)] start0((start)) fetch0[/samples from ->audio-pcm/] audio0(^sample-count) process0[process] stop0((stop)) start0 --> fetch0 fetch0 --> audio0 audio0 --> process0 process0 --> fetch0 process0 -->|recognize| stop0 end listenBegin(^listen-begin) stop0 --> listenBegin listenBegin --> start1 subgraph slot1[slot 1 (phrasespot)] start1((start)) fetch1[/samples from ->audio-pcm/] audio1(^sample-count) process1[process] result1(^result) stop1((stop)) loop{loop == 0?} loop2{loop == 2?} start1 --> fetch1 fetch1 --> audio1 audio1 --> process1 process1 --> fetch1 process1 --->|recognize| result1 process1 -->|timeout| loop2 loop2 -->|no| stop1 loop2 -->|yes| fetch1 result1 --> loop loop -->|no| fetch1 loop -->|yes| stop1 end listenEnd(^listen-end) stop1 --> listenEnd listenEnd --> start0 ``` Operation flow. 1. If [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `== 2` skip to step 6. 2. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 3. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 4. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase, else continue at step 2. 5. 
Invoke [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin), then start the wake word in slot 1. 6. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 7. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 8. If [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `!= 2` and processing does not detect a wake word within [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window), invoke [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end) and restart the slot 0 wake word at step 2. 9. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase, else continue at step 6. 10. If [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `== 0` invoke [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end) and continue at step 2. 11. If [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `!= 0` reset the listen-window timeout and continue processing at step 6. 12. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok). Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in. 
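
The three `loop` modes are easier to compare as a small state machine. The Python sketch below models the flow without the SDK, under stated assumptions: `run_sequential`, `spot0`, and `spot1` are hypothetical names, the listen window is counted in audio blocks rather than real time, and only `^listen-begin`, the slot 1 `^result`, and `^listen-end` are recorded (`^sample-count` is left out for brevity).

```python
def run_sequential(blocks, spot0, spot1, loop=0, listen_window=3):
    """Return the events fired for `blocks`, in order.

    spot0, spot1:  hypothetical recognizers, return a phrase or None per block
    listen_window: timeout counted in blocks here, purely for illustration
    """
    events = []
    in_slot1 = loop == 2          # loop == 2 skips the slot 0 wake word
    idle = 0                      # blocks since slot 1 started or last recognized
    for block in blocks:
        if not in_slot1:
            if spot0(block) is not None:
                events.append("^listen-begin")
                in_slot1, idle = True, 0
            continue
        phrase = spot1(block)
        if phrase is not None:
            events.append("^result:" + phrase)
            if loop == 0:         # one command, then back to slot 0
                events.append("^listen-end")
                in_slot1 = False
            else:
                idle = 0          # loop != 0: reset the listen-window timeout
        else:
            idle += 1
            if loop != 2 and idle >= listen_window:
                events.append("^listen-end")
                in_slot1 = False
    return events
```

With `loop=0` a single command ends the listen window, with `loop=1` every recognition extends it, and with `loop=2` the model never returns to slot 0.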
## Settings **Available events:** [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin), [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event) **Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator) **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) **Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream), [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream) **Available configuration settings:** [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0), [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1), [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), 
[dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window), [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold) **Available values:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code) ## Notes With [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `== 0` (the default): This template runs the spotter in slot `0` until it spots, then runs slot `1` until it spots, or the [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window) timeout expires, then returns to the spotter in slot `0`. 
With [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `== 1`: This runs the spotter in slot `0` until it spots, then runs slot `1` until the [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window) timeout expires, then returns to the spotter in slot `0`. It resets the expiration timer every time slot `1` recognizes. With [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `== 2`: The template runs only slot `1`. If your application needs to listen for a wake word but also support an external trigger, such as a push-to-talk button, set `loop=2` when the event occurs. The combined model is a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) and can be used in any application that expects those without code changes. Combined model settings refer to the model in slot `1`, so [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) refers to `1.operating-point`. You can change settings for the wake word in slot `0` by prefixing the setting name with [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0), for example: `0.operating-point`. The model invokes [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin) just before audio focus switches to slot 1, and [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end) before audio focus switches back to slot 0. If there's no [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) between `^listen-begin` and `^listen-end` it is because the recognizer in slot 1 timed out. 
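
An application can use the event ordering described above to tell a slot 1 timeout from a successful command. The following Python sketch shows one possible bookkeeping scheme. `ListenTracker` and its method names are hypothetical; in a real application each method would be called from the handler registered for the matching event.

```python
class ListenTracker:
    """Count listen windows that closed without a slot 1 result."""

    def __init__(self):
        self._saw_result = False
        self.timeouts = 0

    def on_listen_begin(self):
        # A new listen window opens: forget results from earlier windows.
        self._saw_result = False

    def on_result(self, text):
        self._saw_result = True

    def on_listen_end(self):
        # No ^result since ^listen-begin means the slot 1 recognizer timed out.
        if not self._saw_result:
            self.timeouts += 1
```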
## Examples ```console % cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0 % bin/snsr-edit -o vg-music.snsr\ -t model/tpl-spot-sequential-1.5.0.snsr\ -f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr\ -f 1 model/spot-music-enUS-1.2.0-m.snsr # say "voice genie, play music" % bin/snsr-eval -vvt vg-music.snsr Using live audio from default capture device. ^C to stop. Using operating point 17. Available operating points: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20. Available vocabulary: 1: "play_music" 2: "previous_song" 3: "stop_music" 4: "next_song" 5: "pause_music" 3180 [^listen-begin] phrase: 3630 4410 (1 sv) play_music words: 3630 3900 (1 sv) 3900 4410 (1 sv) play_music 4635 [^listen-end] ^C ``` *[API]: Application Programming Interface *[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "models/tpl/tpl-spot-vad-lvcsr.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr/" --- # tpl-spot-vad-lvcsr _(TrulyNatural only)_ This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) runs the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) until it detects, segments the audio following the wake word with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type), and sends the segmented audio to the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer in slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1). This behavior is also available in the [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type) template, which adds an option to skip the wake word. 
`tpl-spot-vad-lvcsr` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot). Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type): * **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) * **Slot 1:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr) **Also see these related items:** [tpl-spot-vad-lvcsr-3.23.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-vad-lvcsr), [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type) ## Operation ```mermaid flowchart TD start((start)) start --> startWW subgraph slot0[slot 0 (phrasespot)] startWW((start)) fetchWW[/samples from ->audio-pcm/] audioWW(^sample-count) processWW[process] result(0.^result) stopWW((stop)) startWW --> fetchWW fetchWW --> audioWW audioWW --> processWW processWW --> fetchWW processWW -->|recognize| result result --> stopWW end subgraph slot1[slot 1 (lvcsr)] startSTT((start)) startSTTfinal((start)) stopSTT((stop)) stopSTTpartial((stop)) processSTT[process] partialSTT(^result-partial) intentSTT(^nlu-intent) slotSTT(^nlu-slot) resultSTT(^result) nluSTT{NLU match?} slmSTT{SLM included?} generateSTT[generate] slmstartSTT(^slm-start) slmresultpartialSTT(^slm-result-partial) slmresultSTT(^slm-result) startSTT --> processSTT processSTT ---->|hypothesis| partialSTT partialSTT --> stopSTTpartial startSTTfinal --> nluSTT nluSTT -->|yes| intentSTT nluSTT -->|no| resultSTT intentSTT --> slotSTT slotSTT --> resultSTT slotSTT -->|more| intentSTT resultSTT --> slmSTT slmSTT -->|yes| slmstartSTT slmSTT -->|no| stopSTT slmstartSTT -->|OK| generateSTT slmstartSTT -->|STOP| stopSTT generateSTT -->|response| slmresultpartialSTT slmresultpartialSTT --> generateSTT generateSTT -->|done| slmresultSTT slmresultSTT --> stopSTT end listenBegin(^listen-begin) listenEnd(^listen-end) stopWW --> listenBegin listenBegin --> fetch0 fetch0[/samples from ->audio-pcm/] fetch1[/samples from ->audio-pcm/] audio0(^sample-count) audio1(^sample-count) silence(^silence) begin(^begin) END(^end) limit(^limit) process0[VAD process] process1[VAD process] final@{ shape: f-circ } fetch0 --> audio0 audio0 --> process0 process0 --> fetch0 process0 -->|speech start| begin process0 -->|timeout| silence silence ~~~ final silence --> listenEnd begin --> fetch1 fetch1 --> audio1 audio1 --> process1 process1 --> startSTT stopSTTpartial --> fetch1 process1 -->|speech end| END process1 -->|speech limit| limit END --> final limit --> final final --> startSTTfinal stopSTT --> listenEnd listenEnd --> startWW ``` Operation flow. 1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 3. If processing does not detect a wake word, continue at step 1. 4. Invoke [0.^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) for the wake word. 5. Invoke [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin) and start VAD processing. 6. 
Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 7. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 8. If VAD processing does not detect the start of speech within the [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) timeout, invoke [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) and continue at step 15. 9. Invoke [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin) if processing detects the start of speech, else continue at step 6. 10. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 11. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 12. If VAD processing detects an endpoint invoke either [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit) or [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end) and continue at step 14. 13. Process VAD segmented audio in the LVCSR or STT recognizer * Invoke [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) with interim recognition result hypothesis. * Continue at step 10. 14. Produce a final LVCSR or STT recognition hypothesis. * Invoke [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) for each NLU intent found. * Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) with the final recognition hypothesis. * If there's no SLM, continue at step 15. * Invoke [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start), if the callback returns [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop), continue at step 15. 
* Generate SLM result, invoking [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial) on each generated token. * Invoke [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result) with complete SLM result. 15. Invoke [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end) and start listening for the wake word again at step 1. Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in. ## Settings **Available events:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin), [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end), [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event), [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence), [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result), [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial), [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start) **Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), 
[vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator) **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) **Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to) **Available configuration settings:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [backlog-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backlog-interval), [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [custom-vocab](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#custom-vocab), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence), [include-wake-word-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-wake-word-audio), [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), 
[max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [stt-profile](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#stt-profile), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold), [wake-word-at-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#wake-word-at-end) **Available values:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr), [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code) ## Notes Use this template for command and control type applications where commands are initiated with a wake word. The [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) and [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) events are for the LVCSR or STT recognizer in slot 1. If you need direct access to the wake word result, prefix the event with the slot path: `0.^result` Use the slot prefix to read values in the `0.^result` event handler too, for example call [getString](https://doc.sensory.com/tnl/7.8/api/inference.md#getters) with key [0.text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) to read the wake word transcription. 
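When the slot index is only known at run time, the slot-prefixed key can be assembled from its parts. The following is a minimal sketch, assuming the `<slot>.<key>` form shown above; `slot_key` is a hypothetical helper for illustration and is not part of the SDK.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not an SDK function): builds a slot-prefixed key
 * such as "0.^result" or "0.text" from a slot index and a key name.
 * Returns the number of characters written (excluding the terminator),
 * or -1 if buf is too small to hold the whole key. */
static int slot_key(char *buf, size_t size, int slot, const char *key)
{
  assert(buf != NULL && key != NULL && slot >= 0);
  int n = snprintf(buf, size, "%d.%s", slot, key);
  return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The resulting string would then be passed wherever the document uses a literal key such as `0.^result` or `0.text`. Checking the return value avoids silently using a truncated key.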
Set [include-wake-word-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-wake-word-audio)` = 1` to include the wake word audio in the samples passed to the LVCSR or STT recognizer. STT hypotheses do not include the wake word text unless Sensory specifically configured the model to do so. ## Examples ```console % cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0 % bin/snsr-edit -o vg-stt.snsr\ -t model/tpl-spot-vad-lvcsr-3.23.0.snsr\ -f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr\ -f 1 model/stt-enUS-automotive-medium-2.3.15-pnc.snsr\ -s include-wake-word-audio=1 # Say "Voice genie, open the sunroof." % snsr-eval -vt vg-stt.snsr Using live audio from default capture device. ^C to stop. P 2770 3250 (0.4166) Open the sun P 2810 3650 (0.7161) Open the sunroof 1815 3990 [^end] VAD speech region. NLU intent: open_window (0.9956) = open the sunroof NLU entity: roof (0.9595) = sunroof 2810 3690 (0.4394) Open the sunroof. ^C ``` *[API]: Application Programming Interface *[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[NLU]: Natural Language Understanding model *[SLM]: Generative Small Language Model *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "models/tpl/tpl-spot-vad.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad/" --- # tpl-spot-vad This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) runs the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) until it detects, then does start- and endpoint detection with a 
[VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) on the audio stream following the wake word. `tpl-spot-vad` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot-vad](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot-vad). Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type): * **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [tpl-spot-vad-3.13.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-vad) ## Operation ```mermaid flowchart TD start((start)) start --> startWW subgraph slot0[slot 0 (phrasespot)] startWW((start)) fetchWW[/samples from ->audio-pcm/] audioWW(^sample-count) processWW[process] result(^result) stopWW((stop)) startWW --> fetchWW fetchWW --> audioWW audioWW --> processWW processWW --> fetchWW processWW -->|recognize| result result --> stopWW end listenBegin(^listen-begin) listenEnd(^listen-end) stopWW --> listenBegin listenBegin --> fetch0 fetch0[/samples from ->audio-pcm/] fetch1[/samples from ->audio-pcm/] audio0(^sample-count) audio1(^sample-count) silence(^silence) begin(^begin) END(^end) limit(^limit) process0[process] process1[process] out[\samples to <-audio-pcm\] final@{ shape: f-circ } fetch0 --> audio0 audio0 --> process0 process0 --> fetch0 process0 -->|speech start| begin process0 -->|timeout| silence silence --> final begin --> fetch1 fetch1 --> audio1 audio1 --> out out --> process1 process1 --> fetch1 process1 -->|speech end| END process1 -->|speech limit| limit END --> final limit --> final final --> listenEnd listenEnd --> startWW ``` Operation flow. 1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 3. 
If processing detects a vocabulary phrase, skip to step 5. 4. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok). 5. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) 6. Invoke [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin) 7. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 8. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 9. If speech detected within [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) ms continue at step 12. 10. If _no_ speech detected within [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) ms, invoke [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) and skip to step 19. 11. Continue processing at step 7 until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end). 12. Invoke [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin). 13. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 14. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 15. If [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through) `== 1` write speech samples to [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out). 16. 
If end detected within [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording) ms, invoke [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end) and skip to step 19. 17. If end _not_ detected within [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording) ms, invoke [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit) and skip to step 19. 18. Continue processing at step 13 until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end). 19. Invoke [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end) 20. Restart at step 1. Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in. ## Settings **Available events:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin), [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event), [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) **Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator) **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), 
[audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) **Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream), [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream), [skip-to-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-ms), [skip-to-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-sample) **Available configuration settings:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence), [include-wake-word-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-wake-word-audio), [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), 
[listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold), [wake-word-at-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#wake-word-at-end) **Available values:** [phrasespot-vad](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot-vad) **Also see these related items:** [live-segment.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-segment.md#live-segment-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code) ## Notes Use this for wake-word gated audio sent to cloud engines. Set [include-wake-word-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-wake-word-audio) `= 1` to include the wake word audio in the VAD audio output stream. This template writes the VAD-segmented audio to [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out). If your application does not use this, set [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through) `= 0`. 
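The operation flow for this template can be read as a three-phase state machine. The sketch below is a toy model of that event ordering only, not SDK code: the `Flow`, `Frame`, and `flow_step` names are hypothetical, the per-frame `wake_word` and `speech` flags stand in for the real detectors, the first non-speech frame is treated as the endpoint (ignoring [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over)), and `^sample-count` is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Phases of the flow: wake word (steps 1-6), waiting for the start of
 * speech (steps 7-12), and recording speech (steps 13-19). */
typedef enum { PHASE_WAKE_WORD, PHASE_WAIT_SPEECH, PHASE_IN_SPEECH } Phase;

/* Events named after the ones in the operation flow. */
typedef enum {
  EV_RESULT, EV_LISTEN_BEGIN, EV_BEGIN, EV_SILENCE,
  EV_END, EV_LIMIT, EV_LISTEN_END
} Event;

/* Toy per-frame input that replaces the real wake word and VAD models. */
typedef struct {
  int frame_ms;   /* duration of this audio frame */
  int wake_word;  /* non-zero if the wake word was spotted */
  int speech;     /* non-zero if the frame is classified as speech */
} Frame;

typedef struct {
  Phase phase;
  int elapsed_ms;          /* time spent in the current VAD phase */
  int leading_silence_ms;  /* stand-in for the leading-silence setting */
  int max_recording_ms;    /* stand-in for the max-recording setting */
} Flow;

/* Advance the flow by one frame. Writes up to two events to out and
 * returns how many were written. */
static size_t flow_step(Flow *f, Frame in, Event out[2])
{
  size_t n = 0;
  assert(f != NULL && out != NULL);
  switch (f->phase) {
  case PHASE_WAKE_WORD:
    if (in.wake_word) {
      out[n++] = EV_RESULT;
      out[n++] = EV_LISTEN_BEGIN;
      f->phase = PHASE_WAIT_SPEECH;
      f->elapsed_ms = 0;
    }
    break;
  case PHASE_WAIT_SPEECH:
    f->elapsed_ms += in.frame_ms;
    if (in.speech) {
      out[n++] = EV_BEGIN;
      f->phase = PHASE_IN_SPEECH;
      f->elapsed_ms = 0;
    } else if (f->elapsed_ms >= f->leading_silence_ms) {
      out[n++] = EV_SILENCE;
      out[n++] = EV_LISTEN_END;
      f->phase = PHASE_WAKE_WORD;
    }
    break;
  case PHASE_IN_SPEECH:
    f->elapsed_ms += in.frame_ms;
    if (!in.speech || f->elapsed_ms >= f->max_recording_ms) {
      /* Speech still active at the limit means ^limit, otherwise ^end. */
      out[n++] = in.speech ? EV_LIMIT : EV_END;
      out[n++] = EV_LISTEN_END;
      f->phase = PHASE_WAKE_WORD;
    }
    break;
  }
  return n;
}
```

Every path through the VAD phases ends with `EV_LISTEN_END` and returns to the wake word phase, which mirrors steps 19 and 20 of the flow.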
## Examples ```console % cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0 % bin/snsr-edit -o vg-vad.snsr\ -t model/tpl-spot-vad-3.13.0.snsr\ -f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr\ -s include-wake-word-audio=1 # Say "Voice genie, what's the capital of Oregon?" % bin/snsr-eval -o vad-audio.wav -vvt vg-vad.snsr Using live audio from default capture device. ^C to stop. Using operating point 8. Available operating points: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21. Available vocabulary: 1: "voicegenie" phrase: 1950 2550 (1) voicegenie words: 1950 2550 (1) voicegenie 2730 [^listen-begin] 2730 [^begin] 1650 4200 [^end] VAD speech region. 4980 [^listen-end] ^C ``` Review _vad-audio.wav_: the recording starts [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff) ms before the beginning of "voice genie" and continues until [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over) ms after the end of the utterance. *[API]: Application Programming Interface *[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "models/tpl/tpl-vad-lvcsr.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-vad-lvcsr/" --- # tpl-vad-lvcsr _(TrulyNatural only)_ This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) detects speech with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) and sends the segmented audio to the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0). 
`tpl-vad-lvcsr` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot). Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type): * **Slot 0:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr) **Also see these related items:** [tpl-vad-lvcsr-3.17.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-vad-lvcsr), [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type) ## Operation ```mermaid flowchart TD start((start)) start --> fetch0 subgraph slot0[slot 0 (lvcsr)] startSTT((start)) startSTTfinal((start)) stopSTT((stop)) stopSTTpartial((stop)) processSTT[process] partialSTT(^result-partial) intentSTT(^nlu-intent) slotSTT(^nlu-slot) resultSTT(^result) nluSTT{NLU match?} slmSTT{SLM included?} generateSTT[generate] slmstartSTT(^slm-start) slmresultpartialSTT(^slm-result-partial) slmresultSTT(^slm-result) startSTT --> processSTT processSTT ---->|hypothesis| partialSTT partialSTT --> stopSTTpartial startSTTfinal --> nluSTT nluSTT -->|yes| intentSTT nluSTT -->|no| resultSTT intentSTT --> slotSTT slotSTT --> resultSTT slotSTT -->|more| intentSTT resultSTT --> slmSTT slmSTT -->|yes| slmstartSTT slmSTT -->|no| stopSTT slmstartSTT -->|OK| generateSTT slmstartSTT -->|STOP| stopSTT generateSTT -->|response| slmresultpartialSTT slmresultpartialSTT --> generateSTT generateSTT -->|done| slmresultSTT slmresultSTT --> stopSTT end fetch0[/samples from ->audio-pcm/] fetch1[/samples from ->audio-pcm/] audio0(^sample-count) audio1(^sample-count) silence(^silence) begin(^begin) END(^end) limit(^limit) process0[VAD process] process1[VAD process] final@{ shape: f-circ } listenEnd@{ shape: f-circ } fetch0 --> audio0 audio0 --> process0 process0 --> fetch0 process0 -->|speech start| begin process0 -->|timeout| silence silence ~~~ final silence --> listenEnd begin --> fetch1 fetch1 --> audio1 audio1 --> process1 process1 --> startSTT stopSTTpartial --> fetch1 process1 -->|speech end| END process1 -->|speech limit| limit END --> final limit --> final final --> startSTTfinal stopSTT --> listenEnd listenEnd ----> fetch0 ``` Operation flow. 1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 3. If VAD processing does not detect the start of speech within the [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) timeout, invoke [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) and continue at step 1. 4. 
Invoke [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin) if processing detects the start of speech, else continue at step 1. 5. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 6. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 7. If VAD processing detects an endpoint, invoke either [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit) or [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end) and continue at step 9. 8. Process VAD-segmented audio in the LVCSR or STT recognizer. * Invoke [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) with the interim recognition hypothesis. * Continue at step 5. 9. Produce a final LVCSR or STT recognition hypothesis. * Invoke [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) for each NLU intent found. * Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) with the final recognition hypothesis. * If there's no SLM, continue at step 1. * Invoke [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start); if the callback returns [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop), continue at step 1. * Generate SLM result, invoking [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial) on each generated token. * Invoke [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result) with complete SLM result. * Continue at step 1. Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in. 
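The event sequence in step 9, and the effect of registering handlers for only some events, can be illustrated with a small stand-alone dispatcher. This is a sketch under stated assumptions, not the SDK's implementation: `Binding`, `emit`, `finalize`, `count_event`, and `veto_slm` are hypothetical names, `RC_OK` and `RC_STOP` merely stand in for the SDK's OK and STOP return codes, and it assumes that an event without a registered handler behaves as if its handler returned OK.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { RC_OK, RC_STOP } Rc;  /* stand-ins for the SDK's OK and STOP */
typedef Rc (*Handler)(const char *event, void *data);

/* One registered handler: event name, callback, and user data. */
typedef struct { const char *event; Handler fn; void *data; } Binding;

/* Invoke the handler bound to event, if any. Events without a handler
 * are skipped and treated as RC_OK (an assumption of this sketch). */
static Rc emit(const Binding *b, size_t n, const char *event)
{
  assert(event != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp(b[i].event, event) == 0) return b[i].fn(event, b[i].data);
  return RC_OK;
}

/* Step 9: events fired once a final hypothesis is available. */
static void finalize(const Binding *b, size_t n,
                     int intents, int has_slm, int tokens)
{
  for (int i = 0; i < intents; i++) {
    emit(b, n, "^nlu-intent");
    emit(b, n, "^nlu-slot");
  }
  emit(b, n, "^result");
  if (!has_slm) return;
  if (emit(b, n, "^slm-start") == RC_STOP) return;  /* handler vetoed SLM */
  for (int t = 0; t < tokens; t++) emit(b, n, "^slm-result-partial");
  emit(b, n, "^slm-result");
}

/* Example handlers: count invocations, or stop SLM generation. */
static Rc count_event(const char *event, void *data)
{
  (void)event;
  ++*(int *)data;
  return RC_OK;
}

static Rc veto_slm(const char *event, void *data)
{
  (void)event;
  (void)data;
  return RC_STOP;
}
```

An application that only cares about the final transcript would bind a single handler to `^result`; binding `veto_slm` to `^slm-start` shows how a STOP return skips the SLM events entirely.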
## Settings **Available events:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event), [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence), [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result), [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial), [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start) **Available iterators:** _none_ **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) **Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to) **Available configuration settings:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), 
[backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [custom-vocab](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#custom-vocab), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence), [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording), [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [stt-profile](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#stt-profile) **Available values:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr), [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code) ## Notes Use this template for command and control type applications where commands are initiated just by speaking. 
## Examples ```console % cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0 % bin/snsr-edit -o vad-stt.snsr\ -t model/tpl-vad-lvcsr-3.17.0.snsr\ -f 0 model/stt-enUS-automotive-medium-2.3.15-pnc.snsr # Say, for example: "Turn the air conditioning up all the way" % snsr-eval -t vad-stt.snsr P 1000 1040 T P 1000 1600 Turn the egg P 1040 2040 Turn the air conditioner P 1040 2320 Turn the air conditioning up P 1040 2760 Turn the air conditioning up all the way NLU intent: set_fan (0.9547) = turn the air conditioning up 100% NLU entity: hvac (0.9744) = air conditioning NLU entity: percentage_value (0.8963) = 100% 1040 2880 Turn the air conditioning up all the way. ^C ``` *[API]: Application Programming Interface *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[NLU]: Natural Language Understanding model *[SLM]: Generative Small Language Model *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "models/types/ca.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/ca/" --- # Adapting wake word These are fixed wake word models that continuously adapt to speakers' voices to improve false-accept rates. They are drop-in replacements for fixed wake words. Continuously adapting wake word models have [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) and filenames that by convention match `ca-*.snsr` **Also see these related items:** [Adapting wake word models](https://doc.sensory.com/tnl/7.8/models/index.md#ca-models) included in this distribution. 
## Operation ```mermaid flowchart TD start((start)) fetch[/samples from ->audio-pcm/] audio(^sample-count) process result(^result) adaptStarted(^adapt-started) adapted(^adapted) newUser(^new-user) start --> fetch fetch --> audio audio --> process process --> fetch process -->|recognize| result process -->|recognize w/ high SNR| adaptStarted adaptStarted --> result result --> fetch fetch -->|adapted| adapted adapted --> fetch adapted -->|new user identified| newUser newUser --> fetch ``` Recognition flow. 1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 3. Invoke [^adapt-started](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapt-started) if processing detects a vocabulary phrase in a low-noise environment. This starts adapting the model to the speaker's voice on a background thread. 4. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase. 5. Invoke [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted) when the background thread has finished adding an enrollment. * Invoke [^new-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#new-user) if adaptation detects a user it hasn't seen before. 6. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok). Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in. 
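The ordering of the adaptation events relative to `^result` can be summarized in a few lines. This is a toy sketch, not SDK code: the function names are hypothetical, and the explicit `min_snr_db` threshold is an assumption made for illustration, since the criteria the SDK uses to decide that an environment is low-noise are not described here.

```c
#include <assert.h>
#include <stddef.h>

/* Events named after the ones in the recognition flow. */
typedef enum { CA_ADAPT_STARTED, CA_RESULT, CA_ADAPTED, CA_NEW_USER } CaEvent;

/* Steps 3 and 4: events for one detected vocabulary phrase.
 * min_snr_db is a hypothetical threshold for "low-noise". */
static size_t on_detection(double snr_db, double min_snr_db, CaEvent out[2])
{
  size_t n = 0;
  assert(out != NULL);
  if (snr_db >= min_snr_db) out[n++] = CA_ADAPT_STARTED;
  out[n++] = CA_RESULT;
  return n;
}

/* Step 5: events once the background adaptation has finished. */
static size_t on_adaptation_done(int is_new_user, CaEvent out[2])
{
  size_t n = 0;
  assert(out != NULL);
  out[n++] = CA_ADAPTED;
  if (is_new_user) out[n++] = CA_NEW_USER;
  return n;
}
```

Because adaptation runs on a background thread, the events from `on_adaptation_done` would arrive some time after the detection that triggered them, interleaved with ordinary audio processing.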
## Settings **Available events:** [^adapt-started](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapt-started), [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^new-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#new-user), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event) **Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [user-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#user-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator) **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) **Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [delete-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#delete-user), [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream), [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream), [rename-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#rename-user) **Available configuration settings:** 
[audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [cache-file](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#cache-file), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold) **Available values:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code) *[API]: Application Programming Interface *[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "models/types/enroll.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/enroll/" --- # Wake word enrollment These models provide user enrollment for EFT and UDT. 
They produce [wake-word models](https://doc.sensory.com/tnl/7.8/models/index.md#wake-word-models). Enrollment models have [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[enroll](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#enroll) and filenames that by convention match `eft-*.snsr` or `udt-*.snsr` **Also see these related items:** [wake word enrollment models](https://doc.sensory.com/tnl/7.8/models/index.md#enroll-models) included in this distribution. ## Operation Wake word enrollment has two modes: [interactive](https://doc.sensory.com/tnl/7.8/models/types/enroll.md#enroll-interactive) for live recordings, and [offline](https://doc.sensory.com/tnl/7.8/models/types/enroll.md#enroll-offline) for pre-recorded enrollment audio. ### Interactive With [interactive](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#interactive) `= 1` enrollment tasks expect live audio and re-record enrollments that cannot be used. ```mermaid flowchart TD start((start)) fetch[/samples from ->audio-pcm/] audio(^sample-count) segment[segment audio] check[check audio quality] validate[check enrollment consistency] resume(^resume) next(^next) pause(^pause) pass(^pass) fail0(^fail) fail1(^fail) enrolled(^enrolled) progress(^progress) adapted(^adapted) done(^done) zeroCount[count←0] incrCount[count++] start --> next next --> zeroCount zeroCount --> resume resume --> fetch fetch --> audio audio --> segment segment --> fetch segment -->|endpoint| pause pause --> check check -->|good| pass pass ---> incrCount incrCount --> resume pass -->|count == required| validate validate -->|good| next validate -->|bad| fail1 check -->|bad| fail0 fail0 --> resume fail1 --> zeroCount next -->|user == NULL| enrolled enrolled --> enroll enroll --> progress progress --> enroll enroll --->|complete| adapted adapted --> done done ~~~ validate ``` Interactive enrollment flow. 1. 
Invoke [^next](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#next). * If the callback set [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) to `NULL`, start model adaptation at step 8. 2. Reset the enrollment `count` to `0`. This tracks the number of usable enrollments for the current user or phrase. 3. Invoke [^resume](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#resume). The application should use this to restart audio recording. 4. Make an audio recording and segment it with a VAD (UDT) or a wake word (EFT). * Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). * Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). * Process and repeat until a speech segment is found. 5. Invoke [^pause](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pause). The application should pause audio recording. 6. Check enrollment audio quality. * If good, invoke [^pass](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pass) and increment `count`. If `count` has reached [req-enroll](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#req-enroll), start validation at step 7, else start the next recording at step 3. * If bad, invoke [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail) and redo the current recording at step 3. 7. Validate all the enrollment recordings, checking for consistency. * If good, start enrolling the next user or phrase at step 1. * If bad, invoke [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail) and restart at step 2. 8. When all users / phrases are available, invoke [^enrolled](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#enrolled). 9. Train a new recognizer with the enrollments. * Call [^progress](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#progress) repeatedly until done. 10. 
Invoke [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted). 11. Invoke [^done](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#done). **Also see these related items:** [live-enroll.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-enroll.md#live-enroll-code), [enrollUDT.java](https://doc.sensory.com/tnl/7.8/api/sample/java/enrollUDT.md#enrolludt-code), [Enroll.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-enroll) ### Offline With [interactive](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#interactive) `= 0`, enrollment tasks expect pre-recorded audio and fail if any of the enrollments cannot be used. ```mermaid flowchart TD start((start)) fetch[/samples from ->audio-pcm/] audio(^sample-count) segment[segment audio] check[check audio quality] validate[check enrollment consistency] next(^next) pass(^pass) fail0(^fail) fail1(^fail) enrolled(^enrolled) progress(^progress) adapted(^adapted) done(^done) skip[discard enrollment] skipBad[discard bad enrollment] zeroCount[count←0] incrCount[count++] user0{user == NULL?} user1{user == NULL?} start --> user0 user0 -->|yes| next user0 -->|no| user1 next --> user1 user1 -->|no| zeroCount zeroCount --> fetch fetch --> audio audio --> segment segment --> fetch segment -->|endpoint| check check --->|good| pass pass --> incrCount incrCount --> fetch validate --->|bad| fail1 fail1 --> skipBad skipBad --> validate validate --> user1 check -->|bad| fail0 fail0 --> skip skip --> fetch fetch -->|STREAM_END| validate user1 ---->|yes| enrolled enrolled --> enroll enroll --> progress progress --> enroll enroll --->|complete| adapted adapted --> done ``` Offline enrollment flow. 1. If [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) `== NULL`, invoke [^next](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#next). 
The application should set [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) before starting enrollment, or do so in the [^next](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#next) callback. 2. If [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) `== NULL`, start model adaptation at step 7. 3. Reset the enrollment `count` to `0`. This tracks the number of usable enrollments for the current user or phrase. 4. Segment audio with a VAD (UDT) or a wake word (EFT). * Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). * Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). * Process and repeat until a speech segment is found or [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end). * If [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end), validate at step 6. 5. Check enrollment audio quality. * If good, invoke [^pass](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pass) and keep the recording. * If bad, invoke [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail) and discard the recording. * Start the next recording at step 4. 6. Validate all the enrollment recordings, checking for consistency. * If no bad recordings remain, start enrolling the next user or phrase at step 2. * If bad, invoke [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail), remove the recording and revalidate. 7. When all users / phrases are available, invoke [^enrolled](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#enrolled). 8. Train a new recognizer with the enrollments * Call [^progress](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#progress) repeatedly until done. 9. Invoke [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted). 10. 
Invoke [^done](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#done). **Also see these related items:** [spot-enroll.c](https://doc.sensory.com/tnl/7.8/api/sample/c/spot-enroll.md#spot-enroll-code) ## Settings **Available events:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^done](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#done), [^enrolled](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#enrolled), [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail), [^next](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#next), [^pass](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pass), [^pause](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pause), [^progress](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#progress), [^resume](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#resume), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event) **Available iterators:** [enrollment-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#enrollment-iterator), [user-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#user-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator) **Available results:** _none_ **Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [add-context](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#add-context), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [delete-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#delete-user), [re-adapt](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#re-adapt), 
[user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) **Available configuration settings:** [accuracy](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#accuracy), [enrollment-task-index](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#enrollment-task-index), [interactive](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#interactive), [req-enroll](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#req-enroll) **Available values:** [enroll](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#enroll) **Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code) *[API]: Application Programming Interface *[EFT]: Enrolled Fixed Trigger: fixed wake words adapted to a speaker to improve accuracy *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[UDT]: User-Defined Trigger: enrolled wake words and command sets *[VAD]: Voice Activity Detector --- source_path: "models/types/index.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/" --- # Model types The [type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type) of a model specifies the runtime behavior: what it does, which [setting keys](https://doc.sensory.com/tnl/7.8/api/setting-keys/index.md#setting-keys) it supports, and when it invokes [event](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#events) callbacks. These SDKs support both [fundamental](https://doc.sensory.com/tnl/7.8/models/types/index.md#fundamental-models) and [composed](https://doc.sensory.com/tnl/7.8/models/types/index.md#composed-models) model types. Fundamental types include wake words, adapting wake words, models that create wake words through user enrollment, VAD, LVCSR, and STT. 
Templates add features by composition, combining multiple fundamental models into one. The TrulyHandsfree SDK supports all wake word, wake word enrollment, and VAD models. TrulyNatural (Lite) includes TrulyHandsfree and support for LVCSR. TrulyNatural STT includes TrulyNatural (Lite) and adds speech to text. ## Fundamental models [Wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) - Fixed and enrolled wake words, and keyword-spotted command sets. [Adapting wake word](https://doc.sensory.com/tnl/7.8/models/types/ca.md#ca-type) - Fixed wake word models that continuously adapt to speakers' voices to improve false-accept rates. [Wake word enrollment](https://doc.sensory.com/tnl/7.8/models/types/enroll.md#enroll-type) - Adapts fixed (EFT) and user-defined (UDT) wake words to speakers' voices, creating [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models specific to these speakers. [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) - Finds the start- and endpoints of speech segments in a stream of audio data. [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) _(TrulyNatural only)_ - These recognizers use a phonetic acoustic model and an FST vocabulary decoder. They are suitable for small to medium vocabulary tasks, but not for audio transcription. [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) _(STT only)_ - Audio transcription with transformers. ## Composed models [Templates](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) add behavior to the fundamental model types listed above. Use these, for example, to create a single model that waits for a keyword, runs a VAD, and then recognizes the segmented speech with an STT recognizer. This composed model uses the same API as a simple wake word and does not require application code changes. 
*[API]: Application Programming Interface *[EFT]: Enrolled Fixed Trigger: fixed wake words adapted to a speaker to improve accuracy *[FST]: Finite-State Transducer *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[SDK]: Software Development Kit *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[UDT]: User-Defined Trigger: enrolled wake words and command sets *[VAD]: Voice Activity Detector --- source_path: "models/types/lvcsr.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/lvcsr/" --- # LVCSR _(TrulyNatural only)_ These recognizers use a phonetic acoustic model and an FST vocabulary decoder. They are suitable for small to medium vocabulary tasks, but not for unconstrained audio transcription. These models have [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr) and filenames that by convention match `lvcsr-*.snsr`. You can create LVCSR recognizers with [VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub) or by [specifying a grammar](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-based-recognition) with a build-capable[^1] model. LVCSR recognizers include support for decoding with statistical [language models], but Sensory does not distribute the tools used to create these[^2]. Language models can provide improved accuracy for constrained target domains. _For transcription type tasks, an STT model is a better fit._ The Sensory FST decoder supports hybrid models that contain both grammar-based and language model components. **Also see these related items:** [LVCSR models](https://doc.sensory.com/tnl/7.8/models/index.md#lvcsr-models) included in this distribution. 
[^1]: LVCSR models created by [VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub) include build components only if the grammar references at least one user-defined class, such as `~dynamic-1`. If the grammar contains no unresolved classes, VoiceHub removes the build components to reduce model file size and RAM use. [^2]: Contact your [sales representative](https://doc.sensory.com/tnl/7.8/contact.md#sales) if you would like to explore using a custom language model for your application. ## Operation ```mermaid flowchart TD start((start)) fetch[/samples from ->audio-pcm/] audio(^sample-count) process partial(^result-partial) intent(^nlu-intent) slot(^nlu-slot) result(^result) nlu{NLU
match?} start --> fetch fetch --> audio audio --> process process --> fetch process -->|hypothesis| partial partial --> fetch process -->|VAD endpoint
or STREAM_END| nlu nlu -->|yes| intent nlu -->|no| result intent --> slot slot --> result slot -->|more| intent result --> fetch ``` Recognition flow. 1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 3. Invoke [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) with interim recognition hypotheses every [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval) ms. 4. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok), or an external [VAD](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#vad) detects a speech endpoint. 5. If NLU is configured, invoke [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) for each top-level result that matches. 6. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) with the final recognition hypothesis. 7. Resume processing from step 1. **Note:** LVCSR recognizers do **not** produce a final recognition hypothesis until they run out of audio samples to process, or an external VAD detects a speech endpoint. With live audio you should use these with a VAD template such as [tpl-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-vad-lvcsr), [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-opt-spot-vad-lvcsr), or [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-vad-lvcsr). 
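
The event order in the recognition flow can be sketched as a small, self-contained simulation. It does not call the Sensory SDK: `run_recognition`, the `(words, endpoint)` chunk format, and the `nlu` callable are hypothetical names chosen for illustration, and partial results are emitted once per chunk rather than on a timer.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Event = Tuple[str, str]
# Hypothetical NLU hook: maps final text to (intent, slot value) pairs.
Nlu = Callable[[str], List[Tuple[str, str]]]


def run_recognition(
    chunks: Iterable[Tuple[List[str], bool]],
    nlu: Optional[Nlu] = None,
) -> List[Event]:
    """Return the events a recognizer would raise, in order.

    Each chunk is (newly decoded words, endpoint flag). The endpoint flag
    stands in for an external VAD endpoint; running out of chunks stands
    in for STREAM_END.
    """
    events: List[Event] = []
    words: List[str] = []

    def finalize() -> None:
        text = " ".join(words)
        if nlu is not None:
            for intent, slot in nlu(text):
                events.append(("^nlu-intent", intent))
                events.append(("^nlu-slot", slot))
        events.append(("^result", text))
        words.clear()

    for new_words, endpoint in chunks:
        events.append(("^sample-count", ""))
        words.extend(new_words)
        if endpoint:
            finalize()
        elif new_words:
            events.append(("^result-partial", " ".join(words)))

    if words:  # STREAM_END with a pending hypothesis
        finalize()
    return events


if __name__ == "__main__":
    demo = run_recognition(
        [(["call"], False), (["home"], True)],
        nlu=lambda text: [("call", "home")] if text.startswith("call") else [],
    )
    for name, payload in demo:
        print(name, payload)
```

Without an endpoint flag or the end of the input, the sketch never produces `^result`, which mirrors the note above about pairing these recognizers with a VAD for live audio.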
## Settings **Available events:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event) **Available iterators:** _none_ **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) **Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream), [phrases-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#phrases-stream) **Available configuration settings:** [ac-prune-top-k](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#ac-prune-top-k), [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [complete-only](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#complete-only), [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval), [ram-limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#ram-limit), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), 
[search.frame-nota](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#searchframe-nota), [show-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#show-silence) **Available values:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr) **Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code) ## Notes Sensory optimizes hybrid models with a background component only to detect speech that is not in the specified grammar. These models report an [nlu-intent-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-name) of `background` when they detect out-of-grammar utterances. You should not use the out-of-grammar recognition [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) result as this will have a high word error rate. Consider using [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) for transcription tasks instead. [phrases-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#phrases-stream) provides a convenient way to specify a recognition vocabulary from an exhaustive list of alternative utterances. ## Grammar-based recognition Sensory's LVCSR models use [grammars](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-syntax) to constrain the possible utterances they can recognize. Focussing on a limited set of words and structures defined in these grammars improves recognition speed and accuracy at the expense of recognizing arbitrary input. 
You can create a custom recognizer by specifying a fixed grammar during development if the recognition vocabulary is entirely known, or at runtime if it is not. You can also use a hybrid approach and build the invariant parts during development, and delay adding [variable parts](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-classes) (such as a list of favorite TV channels) until runtime. ### Creating a recognizer Create a grammar-based recognizer using the [command-line tools](https://doc.sensory.com/tnl/7.8/tools/index.md#command-line-tools). This example uses _data/grammars/enrollments.txt_ which contains a sample grammar specification for the enrollment recordings in _data/enrollments/_. To create a custom recognizer using this grammar with [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit), specify an LVCSR model that supports building and [grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream). ### Details: _data/grammars/enrollments.txt_ ``` # LVCSR grammar specification for test utterances in data/enrollments/ # # In a tpl-spot-vad-lvcsr pipeline the prefix would be consumed by the spotter. prefix = armadillo | jackalope | terminator; # List of known utterances in the *-c.wav files. sentence = 18 percent of 643 | call the nearest target | how far away is winco | play more songs by this artist | record a video | start a timer for 20 minutes | i'm running low on gas | cancel all my meetings on friday | directions to susan's house | do i have any new texts | open my calendar to next week | set an alarm for 6 am tomorrow; # Match the prefix and zero or one of the sentences. # ~~and~~ are sentence start and end markers that # match silence and small amounts of extraneous speech. 
g = ~~$prefix $sentence?~~ ; ``` ```console % cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0 % bin/snsr-edit -vv -t model/lvcsr-build-enUS-14.0.2-5MB.snsr \ -f grammar-stream data/grammars/enrollments.txt \ -o lvcsr-enrollments.snsr Loading "model/lvcsr-build-enUS-14.0.2-5MB.snsr" as the template model. Loading "data/grammars/enrollments.txt" into setting "grammar-stream". Saved edited model to "lvcsr-enrollments.snsr". ``` Run the new model with [snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval): ```console % bin/snsr-eval -t lvcsr-enrollments.snsr \ -s partial-result-interval=0 \ # (1)! data/enrollments/armadillo-1-3-c.wav 165 2745 armadillo play more songs by this artist ``` 1. [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval)` = 0` shows only the final recognition hypothesis. For small grammars such as this the build time is negligible. [snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval) can build and run the recognizer in a single operation: ```console % bin/snsr-eval -t model/lvcsr-build-enUS-14.0.2-5MB.snsr \ -f grammar-stream data/grammars/enrollments.txt \ -s partial-result-interval=0 \ data/enrollments/armadillo-1-3-c.wav 165 2745 armadillo play more songs by this artist ``` ### Classes A symbol that starts with the tilde `~` sigil specifies a [recognition class](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-syntax-class). Class recognizers have their own grammar specifications, separate from the top-level grammar. The behavior of a class-based recognizer is similar to that specified by a rule. Classes, however, can be updated without recompiling the rest of the grammar, and all references to a class use the same recognizer. This can reduce the recognizer size and improve build speed. 
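
As a rough, text-level illustration of why shared classes can keep a recognizer smaller than repeated rule references, the sketch below inlines `$name` references with implicit grouping and leaves `~class` references untouched. `inline_rules` is a hypothetical helper written for this page; it only manipulates strings and says nothing about how the SDK actually builds a recognizer.

```python
import re
from typing import Dict

# "$name" rule references; "~name" class references are deliberately not matched.
_RULE_REF = re.compile(r"\$([A-Za-z_][\w-]*)")


def inline_rules(rules: Dict[str, str]) -> str:
    """Inline every $name reference and return the final rule's expression.

    `rules` maps rule names to expressions in definition order. Each
    substitution is wrapped in parentheses (implicit grouping).
    """
    expanded: Dict[str, str] = {}
    final = ""
    for name, expr in rules.items():
        def substitute(match: "re.Match[str]") -> str:
            ref = match.group(1)
            if ref not in expanded:
                raise KeyError(f"rule ${ref} used before its definition")
            return "(" + expanded[ref] + ")"

        expanded[name] = _RULE_REF.sub(substitute, expr)
        final = expanded[name]
    return final


if __name__ == "__main__":
    with_rule = inline_rules({
        "number": "18 | 643 | 20 | 6",
        "sentence": "$number percent of $number | start a timer for $number minutes",
    })
    with_class = inline_rules({
        "sentence": "~number percent of ~number | start a timer for ~number minutes",
    })
    print(with_rule)   # the number alternatives appear three times
    print(with_class)  # three references to one ~number class, no duplication
```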
This example uses a modified enrollment grammar which references two toy classes: `~number` and `~place`: **`enrollments-class.txt`** ``` # LVCSR grammar specification for test utterances in data/enrollments/ # This references two class sub-recognizers: ~number and ~place # # In a tpl-spot-vad-lvcsr pipeline the prefix would be consumed by the spotter. prefix = armadillo | jackalope | terminator; # List of known utterances in the *-c.wav files. sentence = ~number percent of ~number | call the nearest ~place | how far away is ~place | play more songs by this artist | record a video | start a timer for ~number minutes | i'm running low on gas | cancel all my meetings on friday | directions to ~place | do i have any new texts | open my calendar to next week | set an alarm for ~number am tomorrow; # Match the prefix and zero or one of the sentences. # ~~and~~ are sentence start and end markers that # match silence and small amounts of extraneous speech. g = ~~$prefix $sentence?~~ ; ``` **`place.txt`** ``` # Example place name class recognizer. g = target | winco | susan's house; ``` The `~number` and `~place` classes referenced in _enrollments-class.txt_ create two new dynamic settings for these classes: `grammar-stream.number` and `grammar-stream.place`. Specify these to create a complete recognizer: ```console % snsr-edit -v -t model/lvcsr-build-enUS-14.0.2-5MB.snsr\ -f grammar-stream enrollments-class.txt \ -g grammar-stream.number "g = 18 | 643 | 20 | 6;" \ # (1)! -o lvcsr-enrollments-class.snsr Output written to "lvcsr-enrollments-class.snsr". ``` 1. [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit)'s `-g` option sets the [grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream)`.number` stream to a string argument. A file can also be used for the number grammar. 
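
When the alternatives for a class come from application data, the grammar text can be generated. The following is a minimal sketch, assuming a class grammar that is a flat list of alternatives like the `~number` and `~place` examples above. `class_grammar` and its conservative character check are hypothetical and not part of the SDK; the check simply rejects characters that this page describes as operators, sigils, or comment markers.

```python
from typing import Iterable

# Characters treated as reserved in this sketch (an assumption based on the
# operators, sigils, and comment marker described in this document).
RESERVED = set("()[]{}|^?+*:/;#$~=")


def class_grammar(phrases: Iterable[object]) -> str:
    """Build 'g = a | b | c;' from a list of phrases."""
    cleaned = []
    for phrase in phrases:
        text = " ".join(str(phrase).split())  # collapse whitespace
        if not text:
            raise ValueError("empty phrase")
        bad = RESERVED.intersection(text)
        if bad:
            raise ValueError(f"reserved characters {sorted(bad)} in {text!r}")
        cleaned.append(text)
    if not cleaned:
        raise ValueError("at least one phrase is required")
    return "g = " + " | ".join(cleaned) + ";"


if __name__ == "__main__":
    print(class_grammar([18, 643, 20, 6]))
    print(class_grammar(["target", "winco", "susan's house"]))
```

The resulting string could be passed as the argument to `-g grammar-stream.number`, or written to a file and supplied with `-f`, as in the command above.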
Run the recognizer: ```console % snsr-eval -v -t lvcsr-enrollments-class.snsr \ -s partial-result-interval=0 \ data/enrollments/armadillo-1-0-c.wav 375 3150 (1.863e-08) armadillo 18 percent of 643 ``` ### Class libraries TrulyNatural 6.15.0 introduced support for pre-built binary class repositories. These contain classes built from frequently used grammar fragments such as dates, times, and numbers. Load binary class repositories into the same [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) as an LVCSR model to add this capability to the model. If a grammar references a class that's not explicitly defined, the class name is looked up in the provided class library or libraries. System class libraries provided by Sensory use a prefix of `s.` for all class names. See [lvcsr-lib-enUS-14.0.2.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#lvcsr-lib-enUS) for a description of the classes used below. **`class-lib.txt`** ``` # Example recognizer with classes from a class library call = call {number ~s.phone-number}; emergency = ~s.call-emergency; timer = {timer ~s.timer-phrases}; commands = {call} | {emergency} | $timer; g = ~~$commands~~ ; ``` This example uses live audio, so it needs [snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval)'s `-a` flag to add a [VAD](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-vad-lvcsr.md#tpl-vad-lvcsr-type) to find the end of each utterance and signal the recognizer to produce a final hypothesis. ```console % snsr-eval -a -t model/lvcsr-build-enUS-14.0.2-5MB.snsr \ -t model/lvcsr-lib-enUS-14.0.2.snsr \ -f grammar-stream class-lib.txt \ -s partial-result-interval=0 # Say: Call 1 800 555 1212 NLU intent: call (0) = call one eight hundred five five five one two one two NLU entity: number (0) = one eight hundred five five five one two one two 3360 6855 call one eight hundred five five five one two one two # Say: Set a timer for 31 minutes. 
NLU intent: timer (0) = set a timer for thirty one minutes 14610 16770 set a timer for thirty one minutes # Say: Call the fire department. NLU intent: emergency (0) = call the fire department 24540 25890 call the fire department ``` **C/C++** Configuring class-based recognition with the C API: ```c SnsrSession s; snsrNew(&s); snsrLoad(s, snsrStreamFromFileName("model/tpl-vad-lvcsr-3.17.0.snsr", "r")); snsrSetStream(s, SNSR_SLOT_0, snsrStreamFromFileName("model/lvcsr-build-enUS-14.0.2-5MB.snsr", "r")); snsrLoad(s, snsrStreamFromFileName("model/lvcsr-lib-enUS-14.0.2.snsr", "r")); snsrSetStream(s, SNSR_GRAMMAR_STREAM, snsrStreamFromFileName("class-lib.txt", "r")); if (snsrRC(s) != SNSR_RC_OK) { fprintf(stderr, "ERROR: %s\n", snsrErrorDetail(s)); return snsrRC(s); } ``` **Java** Configuring class-based recognition with the Java API: ```java SnsrSession s = new SnsrSession(); try { s.load(SnsrStream.fromFileName("model/tpl-vad-lvcsr-3.17.0.snsr", "r")); s.setStream(Snsr.SLOT_0, SnsrStream.fromFileName("model/lvcsr-build-enUS-14.0.2-5MB.snsr", "r")); s.load(SnsrStream.fromFileName("model/lvcsr-lib-enUS-14.0.2.snsr", "r")); s.setStream(Snsr.GRAMMAR_STREAM, SnsrStream.fromFileName("class-lib.txt", "r")); } catch (IOException e) { e.printStackTrace(); return s.rC(); } ``` **Python** Configuring class-based recognition with the Python API: ```python try: with snsr.Session() as s: s.load("model/tpl-vad-lvcsr-3.17.0.snsr") s.set_stream( snsr.SLOT_0, snsr.Stream.from_filename("model/lvcsr-build-enUS-14.0.2-5MB.snsr", "r"), ) s.load("model/lvcsr-lib-enUS-14.0.2.snsr") s.set_stream( snsr.GRAMMAR_STREAM, snsr.Stream.from_filename("class-lib.txt", "r"), ) except snsr.Error as e: print(f"ERROR: {e.message}") ``` ### Syntax A [context-free grammar] is a set of rules that describes the sequences of words that an LVCSR model can recognize. #### Definition 1. Grammars use [UTF-8][] encoding. 1. `#` marks the start of a comment, which extends to the end of the line. 1. 
A _grammar_ is a series of _rules_ representing variable definitions. The final rule in a grammar specifies the recognition vocabulary and typically references rules defined earlier. It should include the sentence start (`~~`) and end (`~~`) markers. 1. A _rule_ is an assignment of the form `name = expr ;` where `name` is a _symbol_ and `expr` is a sequence of _symbols_ and _operators_. `expr` is a type of [regular expression][]. 1. A _symbol_ is a sequence of characters that does not include any whitespace or operators, optionally prefixed by sigils `$` or `~`. A symbol without a sigil is called a _terminal_ and is part of the recognition vocabulary, for example `temperature`. [Special symbols](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-special) are predefined terminals that describe input characteristics such as pauses and the edges of an utterance. 1. The `$` sigil does rule substitution _at build time_. The parser substitutes the value of the rule named `name` for `$name`. Substitutions include an implicit _grouping_ operator: Grammar `a = 1 | 2 | 3; b = $a ;` is equivalent to `b = ~~(1 | 2 | 3)~~ ;`. 1. The `~` sigil substitutes a named recognition class _at runtime_. - Each class is a recognizer with its own grammar, separate from the main grammar. - All references to a class use instances of the same class recognizer. - You can update each class in isolation, without having to recompile the main grammar. - If you have a large rule that's referenced multiple times, converting it to a class can speed up build time significantly. - Use classes to augment a recognition vocabulary at runtime. In a voice dialing application, for example, one would define the entire recognition grammar at build time but use `~contacts` instead of a predefined list of contact names. Once loaded, the application would scan the address book and build only the `~contacts` class. 
- Specify class definitions with [grammar-stream.classname](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream) or [phrases-stream.classname](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#phrases-stream), for example `phrases-stream.contacts`. 2. Operators include _grouping_ parentheses, brackets, and braces, _infix_ operators that indicate logical AND and OR between symbols, and _postfix_ operators that change how the preceding symbol matches input. The [operator precedence](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-op-precedence) table lists the order and direction in which the parser applies operators. 3. Grouping - `( )` Parentheses enclose items that are grouped together. - `[ ]` Square brackets enclose optional items. `[...]` is equivalent to `(...)?`. - `{ }` Braces implement slot-capturing lightweight NLU markup. - `{slotName a b c}` makes `a b c` available as the [nlu-slot-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-value) of [nlu-slot-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-name) `slotName` when the recognizer matches `a b c` to the input audio. - You can nest NLU slots to an arbitrary depth. - The outermost slots are defined as [intents](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-name) and all the nested slots in each intent as [entities](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-name). - Each identified intent invokes handlers registered for [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot). - `{rule}` is shorthand for `{rule $rule}`. - With this grammar: ``` seconds = 1 | 2 | 4 | 8 | half:0.5 a:? | a:? 
quarter:0.25 [of: a:]; shutterSpeed = set shutter speed to {seconds} ( second | seconds ); cmd = ~~{shutterSpeed}~~ ; ``` an utterance of "set shutter speed to a quarter of a second" will produce `set shutter speed to 0.25 second` as recognition output, with an additional [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) callback for the top-level `shutterSpeed` slot: ``` NLU intent: shutterSpeed (0) = set shutter speed to 0.25 second NLU entity: seconds (0) = 0.25 ``` 4. Infix operators - These are valid between symbols and may be surrounded by whitespace. - `^` is the conjunction operator and is implied between adjacent terminals: Grammar `g = one two three;` will recognize only the sequence "one two three". - `|` is the disjunctive operator. It separates alternative items. Grammar `g = one | two | three;` will recognize "one", or "two", or "three". 5. Postfix operators - These directly follow a symbol without any intervening whitespace. - `?` A question mark following a symbol makes that symbol optional: It requires zero or one repetitions of the symbol. - `+` A plus sign following a symbol or a group requires one or more repetitions of it. - `*` An asterisk following a symbol or a group requires zero or more repetitions. - `:` is the rewrite operator. - `left:right` recognizes symbol `left` but produces terminal `right` as a recognition result. - `left:` recognizes symbol `left` but rewrites that to an empty string, eliding `left` from the recognition result. - `:right` inserts `right` into the recognition result. If you say "one two three", grammar `g = ~~one :mississippi two :mississippi three~~ ;` produces "one mississippi two mississippi three". - `/` A forward slash following a symbol followed by a floating point number defines a weight to be associated with that symbol. 
If there's a rewrite operator (`:`), the slash must follow the rewritten-to terminal, for example `one:een/0.123`. Weights are in the logprob domain: convert a $[0, 1]$ probability $p$ to a weight with $w = -\log_{10}(p)$. The default symbol weight is `0`, which corresponds to a probability of `1.0`.
6. `\` escape symbol. To include a literal special character in a grammar specification, escape it with a backslash. The characters that support this are: `^`, `|`, `*`, `+`, `?`, `=`, `[ ]`, `( )`, `;`, `#`, and `:`.

**Also see these related items:** [grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream), [phrases-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#phrases-stream), [nlu-grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#nlu-grammar-stream), [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot)

#### Operator precedence

The following table lists the precedence and associativity of grammar operators. Operators are listed in descending precedence: level `0` is applied first and level `5` last.

Precedence | Operator | Description | Associativity
:---------:|:--------:|-------------|--------------
0 | `:` | Rewrite output
0 | `/` | Symbol weight
1 | `( )` | Grouping
1 | `[ ]` | Optional group
1 | `{ }` | Slot-capturing semantic markup
2 | `?` | Zero-or-one symbol | left-to-right
2 | `+` | One-or-more symbols | left-to-right
2 | `*` | Zero-or-more symbols | left-to-right
3 | `^` | And, implied between symbols | right-to-left
4 | `|` | Alternative | right-to-left
5 | `=` | Rule assignment | right-to-left

This grammar:

```
a = one | two three four;
g = ~~( $a | five six)~~ ;
```

will recognize only these phrases:

```
one
two three four
five six
```

#### Special symbols

A grammar can include these special symbols:

- `~~` - The silence at the start of a sentence.
- `~~` - The silence at the end of a sentence.
- `` - Short pauses between words. The grammar compiler automatically adds these where needed, so there is no need to do so explicitly. Do **not** add `` to NLU grammars; use `` instead.
- `` - An explicit short pause.
- `` - Matches when none of the alternatives are likely (i.e. "none of the above").
    + Recognition results at the phrase level can include `` even if this symbol was not explicitly used in the grammar. This indicates that the result was rejected due to [search.frame-nota](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#searchframe-nota), or that RAM or CPU constraints limited the recognizer's ability to produce a result.
- `` - Similar to ``. In *some* models the threshold for determining whether this symbol matches better than any other is different from that of ``.
- `.` - When used with lightweight [NLU grammars](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#nlu-grammar-stream), a single period matches any input word. Use `.:*` to match any input words and remove them from the NLU result.

[regular expression]: https://en.wikipedia.org/wiki/Regular_expression
[UTF-8]: https://en.wikipedia.org/wiki/UTF-8

*[API]: Application Programming Interface
*[FST]: Finite-State Transducer
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[NLU]: Natural Language Understanding model
*[RAM]: Random Access Memory
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector

---
source_path: "models/types/stt.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/stt/"
---

# Speech To Text

_(STT only)_

These models do audio transcription with transformers.
STT models have [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr) and filenames that by convention match `stt-*.snsr` **Also see these related items:** [STT models](https://doc.sensory.com/tnl/7.8/models/index.md#stt-models) included in this distribution. ## Operation ```mermaid flowchart TD start((start)) fetch[/samples from ->audio-pcm/] audio(^sample-count) process[process] partial(^result-partial) intent(^nlu-intent) slot(^nlu-slot) result(^result) nlu{NLU
match?} slm{SLM
included?} generate[generate] slmstart(^slm-start) slmresultpartial(^slm-result-partial) slmresult(^slm-result) start --> fetch fetch --> audio audio --> process process --> fetch process -->|hypothesis| partial partial --> fetch process -->|VAD endpoint
or STREAM_END| nlu nlu -->|yes| intent nlu -->|no| result intent --> slot slot --> result slot -->|more| intent result --> slm slm -->|yes| slmstart slm -->|no| fetch slmstart -->|OK| generate slmstart -->|STOP| fetch generate -->|response| slmresultpartial slmresultpartial --> generate generate -->|done| slmresult slmresult --> fetch ``` Recognition flow. 1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 3. Invoke [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) with interim recognition hypotheses every [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval) ms. 5. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok), or an external [VAD](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#vad) detects a speech endpoint. 6. If NLU is configured, invoke [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) for each top-level result that matches. 7. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) with the final recognition hypothesis. 8. If an SLM is not available, resume processing at step 1. 9. Invoke [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start). If the handler returns [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop), resume processing at step 1. 10. 
Invoke [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial) as the model generates text. 11. Invoke [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result) when text generation is complete. 12. Resume processing at step 1. **Note:** STT recognizers do **not** produce a final recognition hypothesis until they run out of audio samples to process, or an external VAD detects a speech endpoint. With live audio you should use these with a VAD template such as [tpl-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-vad-lvcsr), [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-opt-spot-vad-lvcsr), or [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-vad-lvcsr). ## Settings **Available events:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event), [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result), [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial), [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start) **Available iterators:** _none_ **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) **Available runtime settings:** 
[->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to) **Available configuration settings:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [custom-vocab](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#custom-vocab), [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [stt-profile](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#stt-profile) **Available values:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr) **Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code) *[API]: Application Programming Interface *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[NLU]: Natural Language Understanding model *[SLM]: Generative Small Language Model *[STT]: Speech To Text: transformers with language model and CTC decoding *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "models/types/vad.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/vad/" --- # VAD Models of this type find speech segments in audio data streams. 
VAD models have [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[vad](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#vad) and filenames that by convention match `vad-*.snsr`

**Also see these related items:** [VAD models](https://doc.sensory.com/tnl/7.8/models/index.md#vad-models) included in this distribution.

## Operation

```mermaid
flowchart TD
  start((start))
  fetch0[/samples from ->audio-pcm/]
  fetch1[/samples from ->audio-pcm/]
  audio0(^sample-count)
  audio1(^sample-count)
  silence(^silence)
  begin(^begin)
  END(^end)
  limit(^limit)
  process0[process]
  process1[process]
  out[\samples to <-audio-pcm\]
  final@{ shape: f-circ }
  start --> fetch0
  fetch0 --> audio0
  audio0 --> process0
  process0 --> fetch0
  process0 -->|speech start| begin
  process0 -->|timeout| silence
  silence --> final
  begin --> fetch1
  fetch1 --> audio1
  audio1 --> out
  out --> process1
  process1 --> fetch1
  process1 -->|speech end| END
  process1 -->|speech limit| limit
  END --> final
  limit --> final
  final --> fetch0
```

Endpointing flow.

1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. If speech is detected within [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) ms, continue at step 6.
4. If _no_ speech is detected within [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) ms, invoke [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) and restart from step 1.
5. Continue processing at step 1 until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end).
6. Invoke [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin).
7. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
8. 
Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 9. If [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through) `== 1` write speech samples to [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out). 10. If end detected within [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording) ms, invoke [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end) and restart from step 1. 11. If end _not_ detected within [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording) ms, invoke [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit) and restart from step 1. 12. Continue processing at step 7 until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end). Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in. 
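
The endpointing flow above can be mimicked with a small state machine. The following is a minimal sketch for illustration only and is not SDK code: the function name `endpoint_events`, the per-frame energy input, the `threshold` parameter, and the use of frame counts instead of milliseconds are all assumptions. It also assumes that a speech end is declared after `trailing_silence` consecutive quiet frames, which this page does not specify. Only the event names and the roles of leading-silence and max-recording are taken from the steps above.

```python
def endpoint_events(energies, threshold=0.5, leading_silence=4,
                    trailing_silence=2, max_recording=6):
    """Toy endpointer: returns a list of (event, frame_index) tuples.

    Hypothetical helper, not part of the TrulyNatural API. All durations
    are counted in frames rather than ms to keep the example short.
    """
    events = []
    state = "waiting"            # "waiting" for speech, or inside "speech"
    waited = recorded = quiet = 0
    for i, energy in enumerate(energies):
        is_speech = energy >= threshold
        if state == "waiting":
            if is_speech:
                # Steps 3 and 6: speech found, report ^begin.
                events.append(("begin", i))
                state, recorded, quiet = "speech", 1, 0
            else:
                waited += 1
                if waited >= leading_silence:
                    # Step 4: no speech in the window, report ^silence.
                    events.append(("silence", i))
                    waited = 0
        else:
            recorded += 1
            quiet = 0 if is_speech else quiet + 1
            if quiet >= trailing_silence:
                # Step 10: end of speech detected, report ^end.
                events.append(("end", i))
                state, waited = "waiting", 0
            elif recorded >= max_recording:
                # Step 11: segment too long, report ^limit.
                events.append(("limit", i))
                state, waited = "waiting", 0
    return events


if __name__ == "__main__":
    frames = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]
    for name, index in endpoint_events(frames):
        print(f"^{name} at frame {index}")
```

In a real application the SDK performs this detection itself; the sketch only shows why `^silence`, `^end`, and `^limit` each return the flow to step 1.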
## Settings **Available events:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event), [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) **Available iterators:** _none_ **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) **Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [skip-to-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-ms), [skip-to-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-sample) **Available configuration settings:** [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence), [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording), [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through), [trailing-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#trailing-silence) 
**Available values:** [vad](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#vad) **Also see these related items:** [live-segment.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-segment.md#live-segment-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code) *[API]: Application Programming Interface *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "models/types/wake-word.md" canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/wake-word/" --- # Wake word Fixed and enrolled wake words, and command sets. Wake word models have [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) and filenames that by convention match `spot-*.snsr` You can create custom wake words and command sets with [VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub) or [wake word enrollment](https://doc.sensory.com/tnl/7.8/models/types/enroll.md#enroll-type). **Also see these related items:** [Wake word models](https://doc.sensory.com/tnl/7.8/models/index.md#wake-word-models) included in this distribution. ## Operation ```mermaid flowchart TD start((start)) fetch[/samples from ->audio-pcm/] audio(^sample-count) process[process] result(^result) start --> fetch fetch --> audio audio --> process process --> fetch process -->|recognize| result result --> fetch ``` Recognition flow. 1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm). 2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event). 3. 
Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase. 4. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok). Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in. ## Settings **Available events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event) **Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator) **Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last) **Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream), [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream) **Available configuration 
settings:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold) **Available values:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) **Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code) *[API]: Application Programming Interface *[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "reference/index.md" canonical_url: "https://doc.sensory.com/tnl/7.8/reference/" --- # Reference This section covers the TrulyNatural SDK product — SDK variants and supported platforms, command-line tools, the supplied models and model types, licensing, and the changelog. 
For the programming interfaces, see the [API reference](https://doc.sensory.com/tnl/7.8/api/index.md#api-reference). [Overview](https://doc.sensory.com/tnl/7.8/reference/overview.md#ref-overview) - **Start here.** SDK variants, development host requirements, supported target platforms, models, tools, and license keys. [Command-line tools](https://doc.sensory.com/tnl/7.8/tools/index.md#command-line-tools) - Utilities for running and constructing models. [Models](https://doc.sensory.com/tnl/7.8/models/index.md#models) - Sample models included in this distribution. [Model types](https://doc.sensory.com/tnl/7.8/models/types/index.md#model-types) - Descriptions of various model types and their behaviors. [Licenses](https://doc.sensory.com/tnl/7.8/licenses/index.md#sensory-sdk-license) - Sensory and third-party legal agreements. [Changelog](https://doc.sensory.com/tnl/7.8/changes/index.md#v7-changes) - Changes by TrulyNatural SDK version. [How to upgrade](https://doc.sensory.com/tnl/7.8/upgrade.md#how-to-upgrade) - Change to a different SDK type or upgrade to a newer version. [VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub) - Guide to selecting the appropriate format for models created with [Sensory's VoiceHub portal][vh]. [Contact information](https://doc.sensory.com/tnl/7.8/contact.md#contact) - How to get in touch with Sensory. 
[vh]: https://www.sensory.com/voicehub/ "Create a custom voice recognizer quickly and easily" *[API]: Application Programming Interface *[SDK]: Software Development Kit *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "reference/overview.md" canonical_url: "https://doc.sensory.com/tnl/7.8/reference/overview/" --- # Overview This section provides a brief overview of this SDK: Features supported by [variant](https://doc.sensory.com/tnl/7.8/reference/overview.md#variants), development host [requirements](https://doc.sensory.com/tnl/7.8/reference/overview.md#requirements), [supported target platforms](https://doc.sensory.com/tnl/7.8/reference/overview.md#supported-target-platforms), `snsr` [model](https://doc.sensory.com/tnl/7.8/reference/overview.md#ref-models) files, command-line [tools](https://doc.sensory.com/tnl/7.8/reference/overview.md#ref-tools), and the software [license keys](https://doc.sensory.com/tnl/7.8/reference/overview.md#license-keys) used to control library features. ## Variants The TrulyHandsfree, TrulyNatural (Lite), and TrulyNatural STT SDKs differ **only** in the types of models they support. The APIs, model formats, tools, etc. are identical. TrulyNatural STT is a strict superset of TrulyNatural (Lite), which in turn is a strict superset of TrulyHandsfree. **[TrulyNatural STT][tnl-stt]:** * [x] Speech-To-Text with transformers and compressed language models. * [x] Recognition hypotheses include punctuation and capitalization. * [x] Machine-learned NLU for intent and entity identification. * [x] Generative language models. * [x] **Sensory has models available for 35 languages**, each in multiple sizes (for best accuracy given a CPU cycle budget). Contact your account representative or [Sensory Sales](https://doc.sensory.com/tnl/7.8/contact.md#sales) for details. 
* [x] _Includes [Open Source software](https://doc.sensory.com/tnl/7.8/licenses/oss.md#open-source-licenses)._ * [x] Features available in TrulyNatural STT only are flagged with _(STT only)_ **[TrulyNatural Lite][tnl-lite]:** * [x] Phonemic acoustic models with FST vocabulary decoding. * [x] [Grammar-based](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-based-recognition) medium vocabulary command and control. * [x] Grammar-based NLU for intent and entity identification. * [x] Tools to build recognizers from grammars or phrase lists. * [x] API to build or augment recognizers at runtime. * [x] Runs [VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub) "large natural language vocabulary" models. * [x] Support for devices with limited RAM (< 1 MiB) and CPU (< 500 MHz). * [x] _No third-party or Open Source software._ * [x] Features available in TrulyNatural (Lite and STT) only are flagged with _(TrulyNatural only)_ **[TrulyHandsfree][thf]:** * [x] [Fixed](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type), [enrolled](https://doc.sensory.com/tnl/7.8/models/types/enroll.md#enroll-type) and [adapting](https://doc.sensory.com/tnl/7.8/models/types/ca.md#ca-type) wake words. * [x] Command sets, which are keyword spotter recognizers for multiple (up to twenty) active phrases. * [x] [VAD](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#vad). * [x] [Command-line tools](https://doc.sensory.com/tnl/7.8/tools/index.md#command-line-tools) to enroll and evaluate wake word models, and to convert wake word models into Sensory's [THF Micro][] DSP format. * [x] Runs [VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub) "wake word" and "simple commands" projects. * [x] _No third-party or Open Source software._ ## Requirements For development, you'll need: - macOS, x86_64 Linux, or Windows (version 10 or later, [Microsoft Visual Studio][msvc] 2022) development machine. - iOS: [Xcode] 26.5 or later. 
- Java: [Java JDK][jdk] 11 through 21. - Android: [Android Studio Panda][as] 2025.3.4 or later. [API level 21][api-levels] or later. **Verified with:** TrulyNatural SDK 7.9.0-pre.0 was verified against **Xcode 26.5** and **Android Studio Panda 4 | 2025.3.4 Patch 1**. Newer point releases are expected to work but are not part of the release-test matrix. Models require audio encoded as 16-bit LPCM and sampled at 16 kHz. For optimal recognition accuracy, ensure that the dynamic range of the input audio spans **at least** 12 bits (-24 [dBFS][] peak-to-peak, sample values from -2048 to 2047) and that no clipping is present. ## Supported target platforms TrulyHandsfree and TrulyNatural run on hundreds of different operating systems and CPU combinations. This distribution includes a subset of these in _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/lib/_. See the `README` files in the platform subdirectories for additional details, such as the toolchain and compiler flags used to build the library. TrulyNatural STT is available for Android, iOS on `arm64` and `arm64e`, macOS, Linux on `x86_64`, `aarch64`, and `arm`, and Windows on `x86_64`. [Contact](https://doc.sensory.com/tnl/7.8/contact.md#contact) Sensory if your target platform isn't listed. 
Platform { data-sort-default }| STT support| Note
:-----------------------------|:----------:|:----
`aarch64-linux-gnu` | • yes | [GLIBC][] >= 2.33
`arm-linux-gnueabi` | • no | [GLIBC][] >= 2.17
`arm-linux-gnueabihf` | • yes | [GLIBC][] >= 2.33
`arm-none-eabi` | • no
`arm-none-eabihf` | • no
`arm-none-eabihf-ethosu` | • no
`armv6-linux-gnueabihf` | • no | [GLIBC][] >= 2.17
`i686-linux-gnu` | • no | [GLIBC][] >= 2.17
`ios` | • yes | 64-bit only
`android` | • yes | [API level][api-levels] >= 21
`macos` | • yes
`mipsel-buildroot-linux-uclibc` | • no
`mipsel-openwrt-linux-musl` | • no
`x86_64-linux-gnu` | • yes | [GLIBC][] >= 2.17
`x86_64-windows-msvc` | • yes | Requires [MSVC Runtime][] 2022

*Included target platform libraries*

## Models

TrulyNatural SDK `.snsr` files include all the models and settings required for a task, and a flow graph that defines the behavior. A task can be as simple as a single-phrase [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type), or something more complicated such as wake word followed by a VAD and an STT recognizer that transcribes the detected speech segment. If you're just interested in the final recognition results, the code required to run these two examples is identical.

This distribution includes sample [models](https://doc.sensory.com/tnl/7.8/models/index.md#models) and [templates](https://doc.sensory.com/tnl/7.8/models/index.md#templates) used to add additional behaviors to these.

## Tools

The _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/bin/_ directory contains a number of [command-line tools](https://doc.sensory.com/tnl/7.8/tools/index.md#command-line-tools). These evaluate models, compose new models, modify settings, enroll wake words, convert wake word models to [THF Micro][] DSP format, and diagnose audio recording quality.

These utilities are compiled for the development host.
You can [compile these from source](https://doc.sensory.com/tnl/7.8/api/sample/c/index.md#c-examples) for other platforms.

## License keys

The TrulyNatural SDK installer embeds the license key entered on the "Product Licensing" page in the libraries and tools it installs. All applications that link against these libraries include this license key. Keys include the SDK licensee name.

License keys control access to specific SDK features, target platforms, and CPU architectures, and specify an expiration date for access.

Model files also include license keys. These are validated upon loading.

License keys fall into two broad categories: _development_ keys, which either expire at some future date or limit use, and _production_ keys, which do not expire and do not have usage limits.

You can use the [LICENSE](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_license) option to apply an updated license key with the [configuration](https://doc.sensory.com/tnl/7.8/api/library-config.md#config) API at runtime.

**Warning:** Do _not_ use development / expiring keys in shipping products. These will stop working when the keys expire. [Contact](https://doc.sensory.com/tnl/7.8/contact.md#contact) Sensory to obtain production-ready libraries and models.

**Also see these related items:** [license-exp-date](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#license-exp-date), [license-exp-message](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#license-exp-message), [license-exp-warn](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#license-exp-warn), [model-license-exp-date](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#model-license-exp-date), [model-license-exp-message](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#model-license-exp-message), [model-license-exp-warn](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#model-license-exp-warn), [LICENSE](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_license), [LICENSE_NOT_VALID](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [LICENSE_LIMIT_EXCEEDED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [LICENSE_OVERRIDE_NOT_SUPPORTED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_supported), [LICENSE_OVERRIDE_NOT_ENABLED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_enabled) [api-levels]: https://en.wikipedia.org/wiki/Android_version_history "Android version history and API levels" [as]: https://developer.android.com/studio/index.html "Android Studio" [dBFS]: https://en.wikipedia.org/wiki/DBFS "Decibels relative to full scale" [GLIBC]: https://sourceware.org/glibc/wiki/Glibc%20Timeline "GNU C Library Release Timeline" [jdk]: https://adoptium.net "Java Development Kit" [msvc]: https://visualstudio.com/ "Microsoft Visual Studio" [MSVC Runtime]: https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170#visual-studio-2015-2017-2019-and-2022 "Microsoft Visual C++ Redistributable" [thf]: https://www.sensory.com/wake-word/ "Low Power Wake Words & Phrase Recognition Engine" [THF Micro]: https://doc.sensory.com/thf-micro/ "THF Micro documentation" [tnl-lite]: 
https://www.sensory.com/natural-language-understanding/ "Large Vocabulary Continuous Speech Recognition (LVCSR) with Dynamic Natural Language Understanding" [tnl-stt]: https://www.sensory.com/embedded-speech-to-text/ "Embedded Speech To Text" *[API]: Application Programming Interface *[FST]: Finite-State Transducer *[LPCM]: Linear pulse-code modulation *[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder *[NLU]: Natural Language Understanding model *[OSS]: Open-source software *[RAM]: Random Access Memory *[SDK]: Software Development Kit *[STT]: Speech To Text: transformers with language model and CTC decoding *[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[VAD]: Voice Activity Detector --- source_path: "reference/voicehub.md" canonical_url: "https://doc.sensory.com/tnl/7.8/reference/voicehub/" --- # VoiceHub [Sensory's VoiceHub][vh] is a web portal that provides a convenient interface for developers to prototype and experiment with wake words, language models and natural language understanding. Users can build custom wake words, voice control command sets, and create grammar-based language models with flexible intents and entities. VoiceHub uses Sensory's [TrulyHandsfree][thf] for wake words and spotted commands, and [TrulyNatural][tnl-lite] for grammar-based recognition with natural language markup to identify intents and entities. ## Output format selection VoiceHub can deliver recognizer models in various formats, as specified by the `Output Format` selector. If you want to use such a model with the TrulyHandsfree or TrulyNatural SDKs you should select the `THF/TNL SDK: snsr file` option. This is the default for new projects. If you are using TrulyHandsfree or TrulyNatural on a small embedded platform, you should select one of the alternate output formats described below. 
**`THF/TNL SDK: snsr file`** _(recommended)_ - This is the standard TrulyNatural model format. Use this unless you will be running the model on an embedded platform with limited CPU cycles and available RAM. **`THF/TNL SDK: snsr file (low memory use)`** - This optimizes the model for small platforms with limited CPU cycles (< 500 MHz) and RAM (< 1 MiB of heap). The reduced heap and CPU requirements come at the expense of a bit of recognition accuracy. **`THF/TNL SDK: .c file (low memory use)`** - Similar to `THF/TNL SDK: snsr file (low memory use)` above, but also includes a model converted to C code that you can compile into your application. On platforms with read-only / flash memory this reduces the amount of RAM required by the size of the model file. - VoiceHub uses [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit) to create two C files from the `snsr` model: - `snsr-edit -c voicehub -t model.snsr` to create _model.c_, and - `snsr-edit -i -t model.snsr -o model-custom-init.c` to create _model-custom-init.c_. This file includes custom initialization code that elides unused modules at link time to [reduce overall application size](https://doc.sensory.com/tnl/7.8/faq.md#reduce-code-size). **`THF/TNL SDK: .c file (low memory use for ST Micro STM32H7)`** - Similar to `THF/TNL SDK: .c file (low memory use)`, but also includes TrulyNatural SDK libraries for use on the STMicroelectronics [STM32H7][] series microcontrollers. **`Embedded: Arm Cortex-M55/M85 Ethos-U55-128`** - Similar to `THF/TNL SDK: .c file (low memory use)`, but with inference optimized for Arm's Cortex-M55/M85 and [Ethos-U55][] NPU with [Vela][] accelerator config `ethos-u55-128`. - Use this on the [Alif Ensemble][] family of microcontrollers. **`Embedded: Arm Cortex-M55/M85 Ethos-U55-256`** - Similar to `THF/TNL SDK: .c file (low memory use)`, but with inference optimized for Arm's Cortex-M55/M85 and [Ethos-U55][] NPU with [Vela][] accelerator config `ethos-u55-256`. 
- Use this on the [Alif Ensemble][] family of microcontrollers. **`Embedded: Infineon Arm Cortex-M55/M85 Ethos-U55-128 (model in RAM)`** - Similar to `THF/TNL SDK: .c file (low memory use)`, but with inference optimized for Arm's Cortex-M55/M85 and [Ethos-U55][] NPU with [Vela][] accelerator config `ethos-u55-128`. - The compiled model code runs from RAM. This requires more available heap than the `(model in ROM/Flash)` option below. Use this only if the flash read speed is too low to allow the model to run in real time. - Use this only on Infineon microcontrollers. **`Embedded: Infineon Arm Cortex-M55/M85 Ethos-U55-128 (model in ROM/Flash)`** - Similar to `THF/TNL SDK: .c file (low memory use)`, but with inference optimized for Arm's Cortex-M55/M85 and [Ethos-U55][] NPU with [Vela][] accelerator config `ethos-u55-128`. - The compiled model code runs from code space. - Use this only on Infineon microcontrollers. [Alif Ensemble]: https://alifsemi.com/products/ensemble/ "The Alif Ensemble family of Arm-based 32-bit microcontrollers" [Ethos-U55]: https://developer.arm.com/Processors/Ethos-U55 "Arm Ethos-U NPU family" [STM32H7]: https://www.st.com/en/microcontrollers-microprocessors/stm32h7-series.html [thf]: https://www.sensory.com/wake-word/ "Low Power Wake Words & Phrase Recognition Engine" [tnl-lite]: https://www.sensory.com/natural-language-understanding/ "Large Vocabulary Continuous Speech Recognition (LVCSR) with Dynamic Natural Language Understanding" [Vela]: https://developer.arm.com/documentation/109267/0102/Tool-support-for-the-Arm-Ethos-U-NPU/Ethos-U-Vela-compiler "Ethos-U Vela compiler" [vh]: https://www.sensory.com/voicehub/ "Create a custom voice recognizer quickly and easily" *[RAM]: Random Access Memory *[ROM]: Read-Only Memory, typically nonvolatile flash memory *[SDK]: Software Development Kit *[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology 
--- source_path: "tools/audio-check.md" canonical_url: "https://doc.sensory.com/tnl/7.8/tools/audio-check/" ---

# audio-check

This tool checks audio for problems such as all-zero runs or clipping. It also estimates the signal-to-noise ratio. The audio file should be a mono WAV file sampled at 16 kHz.

## Usage

```
Reports audio file quality.
usage: audio-check wavfile
 options:
  -v [-v [-v]]        : increase verbosity
```

## Example

```console
% audio-check sampleAudio.wav
Clipping/Saturation: No clipping / saturation - OK
Flat Waveform: Problem: Flat for 1 msec.
Signal-to-Noise Ratio Problem: low signal-to-noise ratio. SNR estimate: 9.19 dBA (poor)
```

*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

--- source_path: "tools/index.md" canonical_url: "https://doc.sensory.com/tnl/7.8/tools/" ---

# Command-line tools

The TrulyNatural SDK includes a number of command-line utilities. Find executables for the host platform in _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/bin/_

## Tools

[snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval) - Evaluates / runs TrulyNatural SDK `.snsr` model files.

[snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit) - Edits/modifies TrulyNatural SDK `.snsr` model files.

[spot-enroll](https://doc.sensory.com/tnl/7.8/tools/spot-enroll.md#spot-enroll) - Enrolls TrulyNatural SDK wake words on audio files.

[live-enroll](https://doc.sensory.com/tnl/7.8/tools/live-enroll.md#live-enroll) - Enrolls TrulyNatural SDK wake words on live audio.

[snsr-eval-batch](https://doc.sensory.com/tnl/7.8/tools/snsr-eval-batch.md#snsr-eval-batch) - Runs a TrulyNatural SDK `.snsr` model file on test data and reports the false accept rate, false reject ratio, optional word-error rate, and execution speed.

[spot-convert](https://doc.sensory.com/tnl/7.8/tools/spot-convert.md#spot-convert) - Converts TrulyNatural SDK wake word models to [THF Micro][] format.

[snsr-log-split](https://doc.sensory.com/tnl/7.8/tools/snsr-log-split.md#snsr-log-split) - Splits spotter log files into an event log, captured audio data, and the source spotter model. [audio-check](https://doc.sensory.com/tnl/7.8/tools/audio-check.md#audio-check) - Reports audio file quality. [THF Micro]: https://doc.sensory.com/thf-micro/ "THF Micro documentation" *[SDK]: Software Development Kit *[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "tools/live-enroll.md" canonical_url: "https://doc.sensory.com/tnl/7.8/tools/live-enroll/" --- # live-enroll Interactive command-line phrase spotter enrollment, using the default audio capture device. **Also see these related items:** [live-enroll.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-enroll.md#live-enrollc) ## Usage ``` Enrolls TrulyNatural SDK wake words on live audio. usage: live-enroll -t task [options] +user1 [+user2 ...] [file ...] options: -e enrollments : enrollment context output filename -o out : enrolled model output filename (default: enrolled-sv.snsr) -p prefix : capture each enrollment to file as --{pass,fail}-.wav -s setting=value : override a task setting -t task : specify task filename (required) -v [-v [-v]] : increase verbosity Enrollment audio is captured from the default microphone, unless the optional [file ...] arguments are supplied. Settings are strings used as keys to query or change task behavior. Most frequently used for enrollment is accuracy, which takes a value between 0 and 1. Refer to the TrulyNatural SDK documentation for a complete list and descriptions of all supported settings. ``` ## Examples Enroll two phrases interactively on a Raspberry Pi 3. 
```console % cd sample/c % make -s -j4 all % bin/live-enroll -v -t ../../model/udt-universal-3.67.1.0.snsr \ +hey-sensory +hello-voice-genie Say the enrollment phrase (1/4) for "hey-sensory" Recording: 3.46 s Preliminary enrollment checks passed. Say the enrollment phrase (2/4) for "hey-sensory" Recording: 3.41 s Preliminary enrollment checks passed. Say the enrollment phrase (3/4) for "hey-sensory" with context, for example: " will it rain tomorrow?" Recording: 4.30 s This enrollment recording is not usable. Reason: silence-begin Fix: Please wait for the prompt before speaking. Say the enrollment phrase (3/4) for "hey-sensory" with context, for example: " will it rain tomorrow?" Recording: 4.44 s Preliminary enrollment checks passed. Say the enrollment phrase (4/4) for "hey-sensory" with context, for example: " will it rain tomorrow?" Recording: 4.18 s Preliminary enrollment checks passed. Say the enrollment phrase (1/4) for "hello-voice-genie" Recording: 2.30 s Preliminary enrollment checks passed. Say the enrollment phrase (2/4) for "hello-voice-genie" Recording: 3.53 s Preliminary enrollment checks passed. Say the enrollment phrase (3/4) for "hello-voice-genie" with context, for example: " will it rain tomorrow?" Recording: 4.22 s Preliminary enrollment checks passed. Say the enrollment phrase (4/4) for "hello-voice-genie" with context, for example: " will it rain tomorrow?" Recording: 5.59 s Preliminary enrollment checks passed. Adapting: 100% complete. Enrolled model saved to "enrolled-sv.snsr" ``` Test: ```console % bin/snsr-eval -v -t ./enrolled-sv.snsr Using live audio from default capture device. ^C to stop. 
1485 2175 (0.70) hey-sensory 4155 5085 (0.68) hello-voice-genie 7710 8685 (0.61) hello-voice-genie 10770 11535 (0.61) hey-sensory ^C ``` *[API]: Application Programming Interface *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology --- source_path: "tools/snsr-edit.md" canonical_url: "https://doc.sensory.com/tnl/7.8/tools/snsr-edit/" --- # snsr-edit This tool edits default task settings, and composes specialized tasks by filling template task slots with spotter models. **Also see these related items:** [snsr-edit.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-edit.md#snsr-editc) ## Usage ``` Edits/modifies TrulyNatural SDK .snsr model files. usage: snsr-edit -t task [options] options: -C tag-identifier : emit C source to load model into RAM -c tag-identifier : emit C source to run model from code space -e setting filename : extract task setting/slot into filename -f setting filename : load filename into task setting/slot -g setting value : load string into task setting -i : emit custom initialization code -o out : output filename -p : prune unused settings to reduce model size -q setting : query a task setting -s setting=value : override a task setting -t task : specify task filename (required) -v [-v [-v]] : increase verbosity Settings are strings used as keys to query or change task behavior. Most frequently used are operating-point for wake words and command sets, leading-silence and trailing-silence for VAD templates, partial-result-interval for LVCSR and STT, and stt-profile for STT models. Refer to the TrulyNatural SDK documentation for a complete list and descriptions of all supported settings. ``` ## Examples Query and change the default [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point). This creates a new file, _hbg-3.snsr_, which is a copy of _spot-hbg-enUS-1.4.0-m.snsr_ with a less accepting default operating point. 
See the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) task description for a list of valid setting names. ```console % snsr-edit -t spot-hbg-enUS-1.4.0-m.snsr -q operating-point operating-point = 10 % snsr-edit -t spot-hbg-enUS-1.4.0-m.snsr -s operating-point=3 -o hbg-3.snsr % snsr-edit -t hbg-3.snsr -q operating-point operating-point = 3 ``` Create a new spotter task model that runs a fixed phrase spotter and an enrolled spotter (user-defined or fixed-trigger) at the same time. _tpl-spot-concurrent-1.5.0.snsr_ is a template with two slots, named `0` and `1`. The combined model _fixed+udt.snsr_ is a standard [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) task that spots the vocabulary from _spot-hbg-enUS-1.4.0-m.snsr_ and _enrolled-sv-0.snsr_ at the same time. ```console % snsr-edit -v -t tpl-spot-concurrent-1.5.0.snsr\ -f 0 spot-hbg-enUS-1.4.0-m.snsr\ -f 1 enrolled-sv-0.snsr -o fixed+udt.snsr Saved edited model to "fixed+udt.snsr". ``` Convert a spotter model to C code. ```console % snsr-edit -v -c voicegenie -t spot-voicegenie-enUS-6.5.1-m.snsr Saved edited model to "spot-voicegenie-enUS-6.5.1-m.c". % head -20 spot-voicegenie-enUS-6.5.1-m.c ``` Create custom TrulyNatural initialization code that limits code references to those modules needed to run the _spot-hbg-enUS-1.4.0-m.snsr_ _spot-music-enUS-1.2.0-m.snsr_ models. Include the generated _snsr-custom-init.c_ file in your application build, and compile with `-DSNSR_USE_SUBSET`. This will reduce the application code size by limiting its capabilities to run snsr models to just the models used to create the custom initialization. See [Compile-time macros](https://doc.sensory.com/tnl/7.8/api/compile-macros.md#compile-time-macros). ```console % snsr-edit -v -i -t spot-hbg-enUS-1.4.0-m.snsr -t spot-music-enUS-1.2.0-m.snsr Output written to "snsr-custom-init.c". 
``` *[API]: Application Programming Interface *[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology *[UDT]: User-Defined Trigger: enrolled wake words and command sets --- source_path: "tools/snsr-eval-batch.md" canonical_url: "https://doc.sensory.com/tnl/7.8/tools/snsr-eval-batch/" --- # snsr-eval-batch This tool runs a [Wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type), [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) model over a (typically large) number of audio files to measure the performance in terms of the false accept (FA) rate, and the false reject (FR) ratio. Can also be used to measure command substitution or word-error-rate (WER) in LVCSR and STT. ## Test data requirements Audio files used for FR (in-vocabulary) testing: - Must contain a single target phrase utterance per file. - Must contain lead-in ambient audio before the target phrase begins. + In most cases one second of ambient audio will suffice. + For custom spotters, refer to the documentation delivered with the model for the exact requirements. + Most models created after May 2020 include a setting, `min-in-vocab-duration`, which specifies the minimum required lead-in time in milliseconds. You can query this with `snsr-edit -t model.snsr -q min-in-vocab-duration` + Recognition events that happen during the required lead-in time are counted as errors. See `INVFA` in the log file format table below. + You can override the minimum lead-in requirement on the command-line (with `-s min-in-vocab-duration=0`), but doing so means you will be testing the model outside of its intended operating environment. - The FR ratio is calculated as the fraction of the in-vocabulary files that the spotter model did not find the phrase in, expressed as a percentage. Example: Out of 2000 files, 120 did not trigger the spotter. 
The false-reject ratio is therefore 6.0%. - If reference phrase checking is used, then mismatches will be noted as substitutions (SB code) and be included in the FR count and ratio. - If word-error-rate is used, then the total words, substitutions, additions and deletions in each phrase will be noted. The total count for each across the entire test set will be reported also. Audio files used for FA testing: - Should be much longer than the in-vocabulary examples. - Should contain a selection of noise expected to be encountered during regular use. - Must not contain explicit instances of the target phrase. - The FA rate is calculated as the average number of times the spotter model mistakenly triggered per hour. Example: Out of 120 hours of audio, the spotter triggered 60 times. The false-accept rate is 0.5 / hour. - If you run `snsr-eval-batch` with the `-u` flag, unexpected recognition events from the FR testing files are included in the false accept totals. These unexpected events include: + Spots that happen during the required lead-in period. + The second and all subsequent spots, as each in-vocabulary file must contain only a single target phrase utterance. - FA testing can only be done on wake words. Commands, LVCSR and STT are not continuous listening technologies and FA testing is not relevant here. ## Usage ``` Runs a TrulyNatural SDK wake word model file on test data and reports the false accept rate, false reject ratio, and execution speed. 
usage: snsr-eval-batch -t task [options]
 options:
  -a                  : Add tpl-vad-lvcsr to LVCSR and STT models
  -c filename         : csv in-vocabulary (FR) and reference filename list
  -f setting filename : load filename into task setting
  -h                  : show this help and exit
  -i filename         : in-vocabulary (FR) filename list
  -j threads          : number of concurrent jobs (default: 1)
  -l filename         : log output file (default: .log)
  -n                  : normalize results (lower case, strip punctuation)
  -o filename         : out-of-vocabulary (FA) filename list
  -s setting=value    : override a task setting
  -t task             : specify task filename (required)
  -u                  : count in-vocabulary FAs
  -v [-v [-v]]        : increase verbosity
  -w                  : calculate word-error rate on in-vocabulary audio

At least one of -i, -c, or -o is required.
-c and -i cannot be used together.
-c file format is two comma-separated filespecs ','

Settings are strings used as keys to query or change task behavior.
Most frequently used for wake words and command sets is operating-point.
Refer to the TrulyNatural SDK documentation for a complete list and
descriptions of all supported settings.
```

- The files specified by the `-i` and `-o` options must contain exactly one audio file path per line, with no extraneous whitespace. The line separator is the newline character, `\n`.
- Files specified by the `-c` option must be comma-separated value (CSV) files with exactly one audio file path and one reference file path per line, and no extraneous whitespace. Each line has two comma-separated fields: the first is an audio file, and the second is a text file containing the reference (expected result). UTF-8 is supported.
- At least one of `-c`, `-i`, or `-o` must be specified.
- `-c` and `-i` cannot be used together.
- `-w` requires `-c`.
- `-u` counts unexpected phrase spots in the in-vocabulary (FR) data towards the false accept total. This only has an effect for spotters that require a significant lead-in time to stabilize. This flag can only be used when testing wake words and commands, and cannot be combined with `-w`.
- The `-j` option determines the number of concurrent threads to start. For multi-core CPUs this can significantly speed up the evaluation.

## Example

```console
% snsr-eval-batch -v -v -v -t alexa-fr.snsr \
    -i inv.txt -o oov.txt -j 6 -s operating-point=10
Writing log to "alexa-fr.log"
INV: 2612 files, 23.128 hr, 23:07:39.285
OOV: 686 files, 142.984 hr, 142:59:01.345
Total: 3298 files, 166.111 hr, 166:06:40.630
Using operating point 10. Available operating points: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20.
3298 files, 166.111 hr, 118 FA 0.83/hr, 3.33% FR, 2525 TA, 658.9x RT
```

- 3298 files processed.
- 166.111 hours of audio processed.
- 118 false accept spots, which is an FA rate of 0.83 per hour.
- 3.33% false reject ratio.
- 2525 true accept spots on in-vocabulary test audio.
- 658.9 real-time factor.

```console
% snsr-eval-batch -a -t model/stt-enUS-automotive-medium-2.3.15-pnc.snsr \
    -c stt-16kHz-en-general-quicktest-full.csv -w -n -j 6
999 files, 1.281 hr, 9174 Words, 833 Substitutions, 198 Insertions, 120 Deletions, 12.546% WER, 5.3 xRT
```

- 999 files processed.
- 1.281 hours of audio processed.
- 9174 total words in test.
- 833 substitutions.
- 198 insertions.
- 120 deletions.
- 12.546% Word Error Rate.
- 5.3 real-time factor.

## Log file format

`snsr-eval-batch` produces a log file in plain text format. Each line in this file follows the same pattern: `KEY [subkey] [detail]`.
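
The summary figures in the example runs above, which are also logged under keys such as `FARATE`, `FRRATIO`, and `WER`, can be cross-checked with simple ratios. The C sketch below is not part of the SDK: the function names are hypothetical, and it assumes that the FA rate is taken over the out-of-vocabulary audio duration and that WER is the usual (substitutions + insertions + deletions) divided by the number of reference words. Both assumptions are consistent with the numbers printed in the two example runs.

```c
#include <stdio.h>

/* Hypothetical helpers for cross-checking results; not SDK functions. */

/* False accepts per hour of out-of-vocabulary (FA) test audio. */
double fa_rate_per_hour(long fa_count, double oov_hours) {
  return oov_hours > 0.0 ? (double)fa_count / oov_hours : 0.0;
}

/* Percentage of in-vocabulary files in which the phrase was not found. */
double fr_ratio_percent(long missed_files, long inv_files) {
  if (inv_files <= 0) return 0.0;
  return 100.0 * (double)missed_files / (double)inv_files;
}

/* Word-error rate as a percentage of the reference word count
 * (assumed formula: (S + I + D) / N). */
double wer_percent(long words, long subs, long ins, long dels) {
  if (words <= 0) return 0.0;
  return 100.0 * (double)(subs + ins + dels) / (double)words;
}

/* Prints the metrics for the counts shown in the example runs.
 * The 87 missed files are derived as 2612 INV files - 2525 TA,
 * which assumes at most one true accept per file. */
void print_example_metrics(void) {
  printf("FA rate:  %.2f / hr\n", fa_rate_per_hour(118, 142.984));
  printf("FR ratio: %.2f%%\n", fr_ratio_percent(2612 - 2525, 2612));
  printf("WER:      %.3f%%\n", wer_percent(9174, 833, 198, 120));
}
```

Calling `print_example_metrics()` from a `main` function should report values that round to the 0.83/hr, 3.33%, and 12.546% shown in the console output. The same `fr_ratio_percent` helper gives 6.0% for the 120-out-of-2000 case described under the test data requirements.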

KEY | subkey | detail | notes
-|-|-|-
CMDFR | "filename" | reference | false reject, no matches detected in this file
CMDSB | "filename" | start-ms end-ms "phrase" "reference" [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score) | mismatch between command phrase and reference
CMDTA | "filename" | start-ms end-ms "phrase" "reference" [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score) | true-accept with reference phrase checking
ERROR | error message | | unexpected error encountered
FR+SBCOUNT | double | | total number of false-reject + substitution errors
FR+SBRATIO | double | % | overall false-reject + substitution ratio
FACOUNT | integer | | total number of false-accept spots
FARATE | double | / hr | overall false-accept rate
FRCOUNT | double | | total number of false-reject errors
FRRATIO | double | % | overall false-reject ratio
INFO | start-time | YYYY-MM-DD HH:MM:SS.sss UTC | job start time in UTC
INFO | completion-time | YYYY-MM-DD HH:MM:SS.sss UTC | job end time in UTC
INFO | duration | double | total job duration in seconds
INFO | sdk-name | "TrulyHandsfree" or "TrulyNatural" |
INFO | sdk-version | version-string | snsr-eval-batch SDK version
INFO | command-line | command-line arguments | includes `argv[0]`
INFO | operating-point | integer | selected operating point
INFO | inv-files | integer | number of in-vocabulary (FR) test files
INFO | inv-seconds | integer | seconds of in-vocabulary audio
INFO | inv-hours | HHH:MM:SS.sss | inv-seconds as hours, minutes, seconds
INFO | oov-files | integer | number of out-of-vocabulary (FA) test files
INFO | oov-seconds | integer | seconds of out-of-vocabulary audio
INFO | oov-hours | HHH:MM:SS.sss | oov-seconds as hours, minutes, seconds
INFO | inv/oov-seconds | integer | seconds of OOV audio in FR test files
INFO | inv/oov-hours | HHH:MM:SS.sss | inv/oov-seconds as hours, minutes, seconds
INFO | rejected-files | integer | number of rejected files (not used in the test)
INFO | real-time-factor | double | total duration of audio processed divided by the processing time
INVFA | "filename" | start-ms end-ms "phrase" [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score) | FA in in-vocabulary test file. This is a spot that happened during the `min-in-vocab-duration` lead-in period, or an additional, spurious, spot recognized in the in-vocabulary file.
INVFR | "filename" | | false reject, no spot in this file
INVTA | "filename" | start-ms end-ms "phrase" [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score) | true accept
INVTX | "filename" | N spots | more than one spot in this file
OOVFA | "filename" | start-ms end-ms "phrase" [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score) | FA in out-of-vocabulary test file
REJECT | "filename" | reason | filename was rejected as unusable
STTFR | "filename" | reference | false reject, no matches detected in this file
STTSB | "filename" | start-ms end-ms "phrase" "reference" word-count, substitutions, additions, deletions, word-error-rate | mismatch between LVCSR/STT phrase and reference
STTTA | "filename" | start-ms end-ms "phrase" "reference" word-count, substitutions, additions, deletions, word-error-rate | true-accept (no mismatch) between LVCSR/STT phrase and reference
TACOUNT | integer | | total number of true-accept spots
WER | double | % | overall word-error-rate
WER_DELETIONS | integer | | total number of WER deletions
WER_INSERTIONS | integer | | total number of WER insertions
WER_SUBSTITUTIONS | integer | | total number of WER substitutions
WER_WORDS | integer | | total number of WER words

*[API]: Application Programming Interface
*[FA]: False Accept: the recognizer triggered when the target phrase was not spoken
*[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[SDK]: Software Development Kit
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

--- source_path: "tools/snsr-eval.md" canonical_url: "https://doc.sensory.com/tnl/7.8/tools/snsr-eval/" ---

# snsr-eval

This tool evaluates / runs TrulyNatural SDK `snsr` model files. It supports all [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type), except wake word [enrollment](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#enroll) which is handled by [spot-enroll](https://doc.sensory.com/tnl/7.8/tools/spot-enroll.md#spot-enroll) and [live-enroll](https://doc.sensory.com/tnl/7.8/tools/live-enroll.md#live-enroll).

**Also see these related items:** [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-evalc)

## Usage

```
Evaluates/runs TrulyNatural SDK .snsr model files.
usage: snsr-eval -t task [options] [wavefile ...]
 options:
  -a                  : Add tpl-vad-lvcsr to LVCSR and STT models
  -d directory        : VAD audio output directory
  -f setting filename : load filename into task setting
  -g setting value    : load string into task setting
  -i listFile         : run evaluation on each filename in listFile
  -l [-l [-l]]        : reduce verbosity
  -o out              : output filename for VAD audio or listFile results
  -p [-p]             : Enable pipeline profiling (experimental)
  -q setting          : query a task setting
  -s setting=value    : override a task setting
  -t task             : specify task filename (required)
  -u filename         : remove unused settings and save model to filename
  -v [-v [-v]]        : increase verbosity

Use a filename of - to read headerless linear 16-bit PCM little-endian
audio from stdin. If you don't specify any wave files, snsr-eval uses
live audio captured from the default audio device.

The -d and -o options are mutually exclusive. The output directory must
be writable. Audio files created by VAD segmentation are named /.wav

Settings are strings used as keys to query or change task behavior.
Most frequently used are operating-point for wake words and command sets,
leading-silence and trailing-silence for VAD templates,
partial-result-interval for LVCSR and STT, and stt-profile for STT models.
Refer to the TrulyNatural SDK documentation for a complete list and
descriptions of all supported settings.
```

## Batch processing

If you specify the `-i listFile` option, `snsr-eval` will evaluate the model on the filenames in `listFile`. This loads the model once and re-uses the [session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) instance for each evaluation, reducing overhead. It expects one filename per line.

In batch processing mode, `snsr-eval` produces output in [tab-separated value][tsv] format. Each audio file in `listFile` has a corresponding result line in the output, unless processing the audio file results in an error. Such errors are treated as warnings and printed to `stderr`.
If you don't specify an output file with `-o`, output goes to `stdout` instead.

Output columns are, in order:

* File index, starting at `1`
* Audio filename
* If there is a recognition [result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result):
    * Start alignment in ms, [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms)
    * End alignment in ms, [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms)
    * Recognition score, [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score)
    * Result hypothesis, [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text)
* If there is an [nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent):
    * Intent name, [nlu-intent-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-name)
    * Intent score, [nlu-intent-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-score)
    * Intent value, [nlu-intent-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-value)
* For each NLU entity found:
    * Entity name, [nlu-entity-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-name)
    * Entity score, [nlu-entity-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-score)
    * Entity value, [nlu-entity-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-value)

## Examples

Fixed-phrase.

```console
% snsr-eval -t ./model/spot-hbg-enUS-1.4.0-m.snsr hbg_2.wav hbg_7.wav
1200 1905 hello blue genie
3855 4575 hello blue genie
```

Fixed-phrase on default audio capture device.

```console
% snsr-eval -v -t ./model/spot-hbg-enUS-1.4.0-m.snsr
Using live audio from default capture device. ^C to stop.
3180 3885 (1.00 sv) hello blue genie
9000 9720 (1.00 sv) hello blue genie
^C
```

Enrolled user-defined phrase.
```console
% snsr-eval -v -t ./three-users.snsr -s sv-threshold=0\
    ./data/enrollments/armadillo-1-4-c.wav ./data/enrollments/armadillo-6-0.wav\
    ./data/enrollments/terminator-2-5.wav ./data/enrollments/jackalope-1-4-c.wav
435 990 (0.89 sv) armadillo-1
5940 6630 (0.99 sv) terminator-2
8100 8610 (0.32 sv) jackalope-1
```

Lower the speaker-verification threshold.

```console
% snsr-eval -v -t ./three-users.snsr -s sv-threshold=0\
    ./data/enrollments/jackalope-1-5.wav ./data/enrollments/jackalope-1-5-c.wav
270 840 (0.56 sv) jackalope-1
2130 2610 (0.33 sv) jackalope-1
```

Recognize a list of audio files.

```console
% find data -name \*.wav > audio-files.txt
% snsr-eval -t model/stt-enUS-automotive-medium-2.3.15-pnc.snsr -o eval.tsv -i audio-files.txt
Processing file 50 of 50, 100.00%
```

[tsv]: https://en.wikipedia.org/wiki/Tab-separated_values "Tab-separated values"

*[API]: Application Programming Interface
*[NLU]: Natural Language Understanding model
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "tools/snsr-log-split.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/tools/snsr-log-split/"
---

# snsr-log-split

Command-line splitter for logfiles generated by the debug template [tpl-spot-debug](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-debug).

## Usage

```
Splits spotter log files into an event log, captured audio data, and the source spotter model.

usage: snsr-log-split logfile [logfile ...]
options:
  -d directory : output directory (default: .)
  -v [-v [-v]] : increase verbosity
```

This will put an event log file (_.txt_), audio file (_.wav_), and model file (_.snsr_) into the current directory, or into the output directory if that was supplied.

## Example

```console
% snsr-log-split -d out debug.log
% ls out
debug.snsr debug.txt debug.wav
```

This will create components of _debug.log_ in subdirectory _out/_ (which must already exist.)
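
Because the output directory must exist before `snsr-log-split` runs, a small wrapper can create it first. The following is a minimal sketch and not part of the SDK: the `split_log` function name and its "not found" message are illustrative assumptions, and it uses only the `-d` option documented above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the SDK): make sure the output
# directory exists, then split a single log file into it.
split_log() {
  logfile=$1
  outdir=$2
  # snsr-log-split expects the -d directory to exist already.
  mkdir -p "$outdir" || return 1
  if command -v snsr-log-split >/dev/null 2>&1; then
    snsr-log-split -d "$outdir" "$logfile"
  else
    # Assumption for this sketch: report a missing tool explicitly.
    echo "snsr-log-split not found in PATH" >&2
    return 127
  fi
}

# Same arguments as the example above; report a failure without aborting.
split_log debug.log out || echo "split failed with status $?" >&2
```

Creating the directory inside the wrapper keeps the call site identical to the documented example while removing the manual `mkdir` step.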
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "tools/spot-convert.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/tools/spot-convert/"
---

# spot-convert

Command-line phrase spotter model conversion tool, targeting Sensory's deeply embedded DSP solutions running [THF Micro][].

**Also see these related items:** [spot-convert.c](https://doc.sensory.com/tnl/7.8/api/sample/c/spot-convert.md#spot-convertc), [THF Micro][]

## Usage

```
Converts TrulyNatural SDK wake word models to THF Micro format.

usage: spot-convert -t task [options] target
options:
  -a               : convert all operating-points
  -c               : create .c output (in addition to .bin)
  -o output        : full prefix for output filenames
  -p output-prefix : prefix for output filenames (default: task-target-)
  -q slotname      : model slot prefix
  -s setting=value : override a task setting
  -t task          : set a task filename (required)
  -v [-v [-v]]     : increase verbosity

Output filenames are determined by the model parameters:
  $(prefix) [-] [slot$(slotname)-] $(target)- $(version)- op$(operating-point)- {dev,prod}- {net,search}.{bin,c,h}
where:
  prefix specified by the -p option, or taken from the filename of the task if -p isn't used.
  version is the oldest DSP library that can run this model.
  -dev- models are limited in runtime or number of recognition events and should not be used in products.
  -prod- models are not limited and ready for production use.

Use the -o option to override this filename pattern to:
  $(prefix)[-]{net,search}.{bin,c,h}

The -o and -a options are mutually exclusive.
Output filenames are constrained to never start with "-"

Settings are strings used as keys to query or change task behavior.
Most frequently used for wake words and command sets is operating-point.
Refer to the TrulyNatural SDK documentation for a complete list and descriptions of all supported settings.
```

## Examples

Convert a phrase spotter model to `pc38` format.
This converts only the default operating point for the model, point 10:

```console
% spot-convert -v -t model/spot-hbg-enUS-1.4.0-m.snsr pc38
operating-point: 10
production-ready: yes
wrote acoustic model (bin) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op10-prod-net.bin"
wrote search model (bin) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op10-prod-search.bin"
wrote search header to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op10-prod-search.h"
```

Convert a phrase spotter model to `pc38` format, overriding base filename:

```console
% spot-convert -v -v -t model/spot-hbg-enUS-1.4.0-m.snsr -p embedded pc38
target: pc38
basename: embedded-pc38-
operating-point: 10
production-ready: yes
wrote acoustic model (bin) to "embedded-pc38-3.4.0-op10-prod-net.bin"
wrote search model (bin) to "embedded-pc38-3.4.0-op10-prod-search.bin"
wrote search header to "embedded-pc38-3.4.0-op10-prod-search.h"
```

Convert a phrase spotter model to `pc38` format, overriding filename metadata:

```console
% spot-convert -v -v -t model/spot-hbg-enUS-1.4.0-m.snsr -o embedded pc38
target: pc38
basename: embedded-
operating-point: 10
production-ready: yes
wrote acoustic model (bin) to "embedded-net.bin"
wrote search model (bin) to "embedded-search.bin"
wrote search header to "embedded-search.h"
```

Convert all operating points, produce C code in addition to the binaries:

```console
% spot-convert -v -a -c -t model/spot-hbg-enUS-1.4.0-m.snsr pc38
operating-point: 1
production-ready: yes
wrote acoustic model (bin) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op01-prod-net.bin"
wrote search model (bin) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op01-prod-search.bin"
wrote search header to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op01-prod-search.h"
wrote acoustic model (C) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op01-prod-net.c"
wrote search model (C) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op01-prod-search.c"
operating-point: 2
production-ready: yes
wrote acoustic model (bin) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op02-prod-net.bin"
wrote search model (bin) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op02-prod-search.bin"
wrote search header to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op02-prod-search.h"
wrote acoustic model (C) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op02-prod-net.c"
wrote search model (C) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op02-prod-search.c"
... etc ...
```

[THF Micro]: https://doc.sensory.com/thf-micro/ "THF Micro documentation"

*[API]: Application Programming Interface
*[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "tools/spot-enroll.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/tools/spot-enroll/"
---

# spot-enroll

Command-line wake word enrollment.

**Also see these related items:** [spot-enroll.c](https://doc.sensory.com/tnl/7.8/api/sample/c/spot-enroll.md#spot-enrollc)

## Usage

```
Enrolls TrulyNatural SDK wake words on audio files.

usage: spot-enroll -t task [options] [+user1 file1 [-c] file2 ...] [+user2 ...]
options:
  -a adaptedfile   : adapted enrollment context output filename
  -c file          : recording contains trailing context
  -e enrolledfile  : enrollment context output filename
  -o out           : enrolled model output filename (default: enrolled-sv.snsr)
  -s setting=value : override a task setting
  -t task          : specify task filename (required)
  -v [-v [-v]]     : increase verbosity

Settings are strings used as keys to query or change task behavior.
Most frequently used for enrollment is accuracy, which takes a value between 0 and 1.
Refer to the TrulyNatural SDK documentation for a complete list and descriptions of all supported settings.
```

## Examples

Enroll two users.
```console
% spot-enroll -t ./model/udt-universal-3.67.1.0.snsr\
    +armadillo-1\
    ./data/enrollments/armadillo-1-0.wav\
    ./data/enrollments/armadillo-1-1.wav\
    ./data/enrollments/armadillo-1-2.wav\
    ./data/enrollments/armadillo-1-3.wav\
    +jackalope-1\
    ./data/enrollments/jackalope-1-0.wav\
    ./data/enrollments/jackalope-1-1.wav\
    ./data/enrollments/jackalope-1-2.wav\
    ./data/enrollments/jackalope-1-3.wav
```

Enroll a single user, with two enrollment recordings that include trailing context:

```console
% spot-enroll -v -t ./model/udt-universal-3.67.1.0.snsr\
    +armadillo-1\
    ./data/enrollments/armadillo-1-0.wav \
    -c ./data/enrollments/armadillo-1-0-c.wav\
    ./data/enrollments/armadillo-1-1.wav\
    -c ./data/enrollments/armadillo-1-1-c.wav
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```

Enroll a user phrase, save the enrollment context to file.

```console
% spot-enroll -v -t ./model/udt-universal-3.67.1.0.snsr\
    -e armadillo-1-enrollments.snsr\
    +armadillo-1\
    ./data/enrollments/armadillo-1-0.wav\
    ./data/enrollments/armadillo-1-1.wav\
    ./data/enrollments/armadillo-1-2.wav\
    ./data/enrollments/armadillo-1-3.wav
Enrollment context saved to "armadillo-1-enrollments.snsr"
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```

Enroll another user phrase, save the enrollment context to file.

```console
% spot-enroll -v -v -t ./model/udt-universal-3.67.1.0.snsr\
    -e jackalope-1-enrollments.snsr\
    +jackalope-1\
    ./data/enrollments/jackalope-1-0.wav\
    ./data/enrollments/jackalope-1-1.wav\
    ./data/enrollments/jackalope-1-2.wav\
    ./data/enrollments/jackalope-1-3.wav
Enrolling user "jackalope-1" from file "./data/enrollments/jackalope-1-0.wav".
Enrolling user "jackalope-1" from file "./data/enrollments/jackalope-1-1.wav".
Enrolling user "jackalope-1" from file "./data/enrollments/jackalope-1-2.wav".
Enrolling user "jackalope-1" from file "./data/enrollments/jackalope-1-3.wav".
jackalope-1: 4 enrollments.
Enrollment context saved to "jackalope-1-enrollments.snsr"
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```

Combine two previously saved enrollment contexts with a third enrollment, save the combined enrollment context. Speed up the adaptation by reducing the accuracy.

```console
% spot-enroll -v -v -v -t ./model/udt-universal-3.67.1.0.snsr\
    -t armadillo-1-enrollments.snsr\
    -t jackalope-1-enrollments.snsr\
    -e combined-enrollments.snsr -s accuracy=0.1\
    +terminator-2\
    ./data/enrollments/terminator-2-0.wav\
    ./data/enrollments/terminator-2-1.wav\
    ./data/enrollments/terminator-2-2.wav\
    ./data/enrollments/terminator-2-3.wav
Enrolling user "terminator-2" from file "./data/enrollments/terminator-2-0.wav".
Enrolling user "terminator-2" from file "./data/enrollments/terminator-2-1.wav".
Enrolling user "terminator-2" from file "./data/enrollments/terminator-2-2.wav".
Enrolling user "terminator-2" from file "./data/enrollments/terminator-2-3.wav".
armadillo-1: 4 enrollments.
jackalope-1: 4 enrollments.
terminator-2: 4 enrollments.
Enrollment context saved to "combined-enrollments.snsr"
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```

Re-enroll with full accuracy, save to three-users.snsr.

```console
% spot-enroll -v -v -v -t ./model/udt-universal-3.67.1.0.snsr\
    -t combined-enrollments.snsr -o three-users.snsr
armadillo-1: 4 enrollments.
jackalope-1: 4 enrollments.
terminator-2: 4 enrollments.
Adapting: 100% complete.
Enrolled model saved to "three-users.snsr"
```

Delete an enrollment from a saved enrollment context.

```console
% spot-enroll -v -v -t ./model/udt-universal-3.67.1.0.snsr\
    -t combined-enrollments.snsr -e two-enrollments.snsr\
    -s delete-user=jackalope-1
armadillo-1: 4 enrollments.
terminator-2: 4 enrollments.
Enrollment context saved to "two-enrollments.snsr"
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```

Enroll two users separately, save adapted contexts, then combine the saved contexts without re-adapting.

```console
% spot-enroll -v -v -t ./model/udt-universal-3.67.1.0.snsr\
    -a armadillo-1-adapted.snsr\
    +armadillo-1\
    ./data/enrollments/armadillo-1-0.wav\
    ./data/enrollments/armadillo-1-1.wav\
    -c ./data/enrollments/armadillo-1-0-c.wav\
    -c ./data/enrollments/armadillo-1-1-c.wav
Enrolling user "armadillo-1" from file "./data/enrollments/armadillo-1-0.wav".
Enrolling user "armadillo-1" from file "./data/enrollments/armadillo-1-1.wav".
Enrolling user "armadillo-1" with context from file "./data/enrollments/armadillo-1-0-c.wav".
Enrolling user "armadillo-1" with context from file "./data/enrollments/armadillo-1-1-c.wav".
armadillo-1: 4 enrollments.
Enrollment context saved to "armadillo-1-adapted.snsr"
Adapting: 100% complete.
Adapted enrollment context saved to "armadillo-1-adapted.snsr"
Enrolled model saved to "enrolled-sv.snsr"
% spot-enroll -v -t ./model/udt-universal-3.67.1.0.snsr\
    -a jackalope-1-adapted.snsr\
    +jackalope-1\
    ./data/enrollments/jackalope-1-0.wav\
    ./data/enrollments/jackalope-1-1.wav\
    -c ./data/enrollments/jackalope-1-0-c.wav\
    -c ./data/enrollments/jackalope-1-1-c.wav
Enrollment context saved to "jackalope-1-adapted.snsr"
Adapting: 100% complete.
Adapted enrollment context saved to "jackalope-1-adapted.snsr"
Enrolled model saved to "enrolled-sv.snsr"
% spot-enroll -v -t ./model/udt-universal-3.67.1.0.snsr\
    -t ./armadillo-1-adapted.snsr\
    -t ./jackalope-1-adapted.snsr
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```

Load adapted contexts and force a re-adapt.

```console
% spot-enroll -v -t ./model/udt-universal-3.67.1.0.snsr\
    -t ./armadillo-1-adapted.snsr\
    -t ./jackalope-1-adapted.snsr\
    -s re-adapt=1
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```

*[API]: Application Programming Interface
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology

---
source_path: "upgrade.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/upgrade/"
---

# How to upgrade

The TrulyHandsfree and TrulyNatural SDKs are fully backwards-compatible with application code and models from earlier releases. To upgrade to a new version or variant (for example from TrulyHandsfree to TrulyNatural), replace the library files and rebuild your application.

## C applications

Use these steps for all applications that link against `libsnsr.a`. This includes those that use other application languages via an adapter, such as [iOS](https://doc.sensory.com/tnl/7.8/api/build-system.md#build-ios).

1. Replace both `libsnsr.a` and `snsr.h` with their updated versions.
    - Find `libsnsr.a` appropriate for your target platform in the _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/lib/_ directory, and `snsr.h` in _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/include/_
    - [new](https://doc.sensory.com/tnl/7.8/api/inference.md#new) does a runtime check to verify that the library and its header are in sync; if they're not, it will return the [LIBRARY_HEADER](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) error code.
    - For iOS, update the [XCFramework][] from _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/lib/ios/snsr.xcframework_
2. If you are using [custom library initialization](https://doc.sensory.com/tnl/7.8/faq.md#reduce-code-size) you must recreate `snsr-custom-init.c` using [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit) from the new SDK version.
3. Rebuild your application.

## Android and Java applications

If you're using the recommended [Android build recipe](https://doc.sensory.com/tnl/7.8/api/build-system.md#build-android):

1. Edit `gradle.properties` and update `SNSR_REPOSITORY`, `SNSR_LIB_VERSION`, and possibly `SNSR_LIB_TYPE`.
    - The SDK installers publish the library artifacts in `mavenLocal()` too. If you're using these, there's no need to update `SNSR_REPOSITORY`.
2. Rebuild your application.

[XCFramework]: https://developer.apple.com/documentation/xcode/creating-a-multi-platform-binary-framework-bundle

*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology