The default behavior if `panicFunc` is `NULL` is to print an
error message to `stderr` and then abort the process. This is
undesirable on small embedded platforms with either no OS, or a very
lightweight RTOS, that run from a single memory image.
If `panicFunc` is called, all TrulyNatural SDK handles are invalid and must
be discarded. An effective way to do this is to use a custom [heap allocator](https://doc.sensory.com/tnl/7.8/api/library-config.md#heap-allocators),
which allows all open handles to be abandoned by reclaiming the heap segment.
**snsrConfig() parameters**
```c
SNSR_API SnsrRC
snsrConfig(
SNSR_CONFIG_PANIC_FUNC,
SnsrPanic panic
);
```
- **Input parameter:** `panic`: [panic](https://doc.sensory.com/tnl/7.8/api/library-config.md#panic) callback function.
**Example**
```c
#include <setjmp.h>
#include <stdarg.h>
#include <stdio.h>
static jmp_buf PanicJmp;
static void
panicFunc(const char *format, va_list a)
{
fprintf(stderr, "\nPANIC: ");
vfprintf(stderr, format, a);
fprintf(stderr, "\n\n");
longjmp(PanicJmp, SNSR_RC_NO_MEMORY);
}
// In main() before any API calls that are not snsrConfig()
int r;
snsrConfig(SNSR_CONFIG_PANIC_FUNC, panicFunc);
if ((r = setjmp(PanicJmp))) {
snsrTearDown();
// handle out-of-memory case. Abandon heap, re-initialize.
}
```
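The "abandon heap, re-initialize" step can be illustrated without the SDK. The following is a minimal sketch under stated assumptions: `Arena`, `arenaAlloc`, `arenaReset`, and `runWithRecovery` are hypothetical names, not TrulyNatural API, and the bump allocator merely stands in for a custom heap segment. It shows only the control flow: a failure deep in the call stack jumps back to the `setjmp` point, and every outstanding allocation is discarded at once by resetting the segment.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical fixed-size arena standing in for a custom heap segment. */
typedef struct {
  unsigned char mem[256];
  size_t used;
} Arena;

static Arena arena;
static jmp_buf recover;

/* Bump allocator (no alignment handling, for brevity). When the arena is
   exhausted it jumps to the recovery point, much as a panic callback would. */
static void *arenaAlloc(size_t n) {
  if (n > sizeof arena.mem - arena.used) longjmp(recover, 1);
  void *p = arena.mem + arena.used;
  arena.used += n;
  return p;
}

/* Abandon every allocation at once by resetting the arena. */
static void arenaReset(void) { arena.used = 0; }

/* Allocates `count` blocks of `size` bytes. Returns 0 on success, or 1
   after recovering from exhaustion and reclaiming the whole arena. */
static int runWithRecovery(size_t count, size_t size) {
  assert(size > 0);
  if (setjmp(recover)) {
    arenaReset();
    return 1;
  }
  for (size_t i = 0; i < count; i++) (void)arenaAlloc(size);
  return 0;
}
```

After a recovery, nothing allocated from the arena may be used again, which mirrors the rule that all SDK handles are invalid once `panicFunc` has been called.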
**Also see these related items:** [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config), [heap allocators](https://doc.sensory.com/tnl/7.8/api/library-config.md#heap-allocators)
**`STT_SUPPORT`**
Library Speech-To-Text support.
[config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config) returns [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok) if the SDK supports Speech-To-Text
models, or [NOT_SUPPORTED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_not_supported) if STT support is not available.
You may retrieve this value even if active library handles exist.
**Note**
You must call [new](https://doc.sensory.com/tnl/7.8/api/inference.md#new) at least once before calling `snsrConfig(SNSR_CONFIG_STT_SUPPORT)`.
If you do not, this function will return [NOT_SUPPORTED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_not_supported) even if the SDK
does include support for Speech-To-Text.
**snsrConfig() parameters**
```c
SNSR_API SnsrRC
snsrConfig(
SNSR_CONFIG_STT_SUPPORT
);
```
**Also see these related items:** [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config), [stt-support](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#stt-support)
**`THREAD_SUPPORT`**
Library multithreading support.
[config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config) returns [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok) if the SDK supports running multi-threaded models,
or [NOT_SUPPORTED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_not_supported) if thread support is not available.
You may retrieve this value even if active library handles exist.
**snsrConfig() parameters**
```c
SNSR_API SnsrRC
snsrConfig(
SNSR_CONFIG_THREAD_SUPPORT
);
```
**Also see these related items:** [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config), [thread-support](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#thread-support)
**`SECURITY_CHIP`**
Hardware security device communication.
Use this if and only if Sensory recommends it.
**snsrConfig() parameters**
```c
SNSR_API SnsrRC
snsrConfig(
SNSR_CONFIG_SECURITY_CHIP,
SnsrChipComms commsFunc
);
```
- **Input parameter:** `commsFunc`: [chipComms](https://doc.sensory.com/tnl/7.8/api/library-config.md#chipcomms) callback function.
**Example**
```c
uint32_t *chipComms(uint32_t *in) {
// communicate with hardware
}
// ...
snsrConfig(SNSR_CONFIG_SECURITY_CHIP, chipComms);
```
**Also see these related items:** [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config)
**`CLOCK_FUNC`**
Set high-resolution monotonic time function and resolution.
Sets a function that must return a high-resolution monotonically-increasing
value that measures clock time. This is used to limit the maximum amount
of time spent in any one call to [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push) with [push-duration-limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#push-duration-limit),
and for recognition pipeline profiling.
**snsrConfig() parameters**
```c
SNSR_API SnsrRC
snsrConfig(
SNSR_CONFIG_CLOCK_FUNC,
SnsrClock clockFunc,
double resolution
);
```
- **Input parameter:** `clockFunc`: [clock](https://doc.sensory.com/tnl/7.8/api/library-config.md#clock) function, returns number of clock ticks.
- **Input parameter:** `resolution`: The number of ticks per second.
**Example**
```c
#include <stdint.h>
#include <time.h>
static uint64_t clockFunc(void) {
struct timespec t;
clock_gettime(CLOCK_MONOTONIC, &t);
// Integer arithmetic: multiplying by the double 1e9 would round large tick counts.
return (uint64_t)t.tv_sec * 1000000000ULL + (uint64_t)t.tv_nsec;
}
int main(int argc, char *argv[]) {
snsrConfig(SNSR_CONFIG_CLOCK_FUNC, clockFunc, 1e9);
// ...
}
```
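The `resolution` argument tells the library how to interpret the tick counts that `clockFunc` returns: an elapsed time in seconds is the tick difference divided by the number of ticks per second. A minimal sketch of that arithmetic, where `ticksToSeconds` is a hypothetical helper and not part of the SDK:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: converts a tick interval to seconds, given the
   resolution in ticks per second. */
static double ticksToSeconds(uint64_t start, uint64_t end, double resolution) {
  assert(resolution > 0.0);
  return (double)(end - start) / resolution;
}
```

With the nanosecond clock above (`resolution` of `1e9`), a difference of 1500000000 ticks corresponds to 1.5 seconds. A platform with a 1 kHz tick counter would pass `1000.0` as the resolution instead.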
**Also see these related items:** [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config), [profile](https://doc.sensory.com/tnl/7.8/api/inference.md#profile), [push](https://doc.sensory.com/tnl/7.8/api/inference.md#push)
**`LICENSE`**
Apply a software license key.
This overrides the software license key embedded in the
TrulyNatural SDK library. Use this to extend the expiration date
or to enable additional features.
[config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config) returns [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok) if the SDK supports overriding
license keys and the license key format is valid,
[LICENSE_OVERRIDE_NOT_SUPPORTED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_supported) if the SDK port does not
support override keys,
and [LICENSE_OVERRIDE_NOT_ENABLED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_enabled) if the existing library
license key does not enable the override feature.
**snsrConfig() parameters**
```c
SNSR_API SnsrRC
snsrConfig(
SNSR_CONFIG_LICENSE,
const char *key,
const char *secret
);
```
- **Input parameter:** `key`: A cryptographically signed license key provided by Sensory. Use `NULL` to disable an existing key.
- **Input parameter:** `secret`: Secret string required to validate the key signature. Can be `NULL`. Provided by Sensory.
**Example**
This key expired on 2026-01-01 and disables all features. Do not use it in your own code.
```c
const char *key =
"eyJ2ZXIiOjEsInBsZCI6ImV5SnNhV05sYm5ObFpTSTZJbE5"
"sYm5OdmNua2dRMjl1Wm1sa1pXNTBhV0ZzSUNBZ0lDSXNJbT"
"F2WkhWc1pYTWlPaUl3SWl3aVpYaHdJam9pTmprMU5qSTVPR"
"EFpTENKamJHbGxiblJKWkNJNklqQWlMQ0ppZFdsc1pDSTZP"
"VFo5Iiwic2lnIjoiNjJLa0I2K2Nvdi9Fd2Y2eGppdDNlSWg"
"xZDVrR1BCYmo3N3BLUWVqU3ZQSkg1Z0RqVGd6VWtOSCtBak"
"diMTcwS2VUNThNN1laQmkwcG1lTEtGNWswRFE9PSJ9";
const char *secret =
"019bb450-caa7-7c2a-b796-960f7d61dc2a";
snsrConfig(SNSR_CONFIG_LICENSE, key, secret);
```
**Also see these related items:** [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config), [license keys](https://doc.sensory.com/tnl/7.8/reference/overview.md#license-keys)
**`LICENSE_INFO`**
Inspect an override license key field.
Returns the string value for the specified override software
license key field.
**Private function**
Do not use this without explicit instructions from Sensory.
[config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config) returns [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok) if the SDK supports overriding
license keys and the license key format is valid,
[LICENSE_OVERRIDE_NOT_VALID](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) if the override key does
not exist or did not pass validation,
[LICENSE_OVERRIDE_NOT_SUPPORTED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_supported) if the SDK port does not
support override keys,
and [LICENSE_OVERRIDE_NOT_ENABLED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_enabled) if the existing library
license key does not enable the override feature.
The returned string value is reference-counted. You must call
[release](https://doc.sensory.com/tnl/7.8/api/heap.md#release) on it before it goes out of scope or a memory leak will result.
The returned value will be `NULL` if the named field does not exist in the license key.
**snsrConfig() parameters**
```c
SNSR_API SnsrRC
snsrConfig(
SNSR_CONFIG_LICENSE_INFO,
const char *field,
const char **value
);
```
- **Input parameter:** `field`: The name of an override license key field.
- **Output parameter:** `value`: The string value for `field`. Reference-counted. Must be [release](https://doc.sensory.com/tnl/7.8/api/heap.md#release)d.
**Example**
```c
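const char *expires = NULL;
SnsrRC r = snsrConfig(SNSR_CONFIG_LICENSE_INFO,
"exp", &expires);
// ... use expires here if r indicates success ...
// Release the reference-counted string (assumes release is snsrRelease()).
if (expires) snsrRelease(expires);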
```
**Also see these related items:** [config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config), [LICENSE](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_license), [license keys](https://doc.sensory.com/tnl/7.8/reference/overview.md#license-keys)
**`LICENSE_SUPPORT`**
Software license key override support.
[config](https://doc.sensory.com/tnl/7.8/api/library-config.md#config) returns [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok) if the SDK supports overriding
license keys, [LICENSE_OVERRIDE_NOT_SUPPORTED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_supported)
if the SDK port does not include support for overriding keys,
and [LICENSE_OVERRIDE_NOT_ENABLED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_enabled) if the
existing library license key does not enable the override feature.
- **What's in the SDK**
  [TrulyHandsfree][thf] (wake words and commands), [TrulyNatural
  Lite][tnl-lite] (LVCSR), and [TrulyNatural STT][tnl-stt]
  (transformer transcription): three variants, one API. Compare
  features, supported platforms, and host requirements.
  [Reference overview](https://doc.sensory.com/tnl/7.8/reference/overview.md#ref-overview)
- **Get started**
  Install the SDK and try wake words, LVCSR, or STT with the
  command-line tools (`snsr-eval`, `snsr-edit`). When you are ready
  to embed, build a Session API sample for your platform: embedded,
  mobile, or desktop.
  [Quick start](https://doc.sensory.com/tnl/7.8/getting-started/index.md#getting-started)
  [Your first program](https://doc.sensory.com/tnl/7.8/getting-started/your-first-program.md#your-first-program)
- **API reference**
  Native [C][] API plus [Java][] and [Python][] bindings (Android
  Kotlin uses the Java binding directly), iOS via a [bridging header][].
  Function-level documentation, types, and error codes.
  [API reference](https://doc.sensory.com/tnl/7.8/api/index.md#api-reference)
## More
[Changelog](https://doc.sensory.com/tnl/7.8/changes/index.md#v7-changes)
- User-visible changes by SDK version.
[Command-line tools](https://doc.sensory.com/tnl/7.8/tools/index.md#command-line-tools)
- `snsr-eval`, `snsr-build`, and friends.
[FAQ](https://doc.sensory.com/tnl/7.8/faq.md#faq)
- Frequently Asked Questions.
[Coding agents](https://doc.sensory.com/tnl/7.8/agent-tools.md#coding-agents)
- Point Claude, Cursor, Aider, Continue, or other AI coding tools at
this doc set so they can answer SDK questions without extra setup.
[Contact information](https://doc.sensory.com/tnl/7.8/contact.md#contact)
- How to reach Sensory engineering and sales.
[Licenses](https://doc.sensory.com/tnl/7.8/licenses/index.md#sensory-sdk-license)
- Legal agreements.
[bridging header]: https://developer.apple.com/documentation/swift/importing-objective-c-into-swift "Importing Objective-C into Swift"
[C]: https://en.wikipedia.org/wiki/C_(programming_language) "C programming language"
[Java]: https://en.wikipedia.org/wiki/Java_(programming_language) "Java programming language"
[Python]: https://en.wikipedia.org/wiki/Python_(programming_language) "Python programming language"
[thf]: https://www.sensory.com/wake-word/ "Low Power Wake Words & Phrase Recognition Engine"
[tnl-lite]: https://www.sensory.com/natural-language-understanding/ "Large Vocabulary Continuous Speech Recognition (LVCSR) with Dynamic Natural Language Understanding"
[tnl-stt]: https://www.sensory.com/embedded-speech-to-text/ "Embedded Speech To Text"
*[API]: Application Programming Interface
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[SDK]: Software Development Kit
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "licenses/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/licenses/"
---
# Sensory SDK license
## Sensory Inc. Software Developer's Kit ("SDK")
**Note:**
NOTICE TO USER: PLEASE READ THIS CONTRACT CAREFULLY. BY USING ALL OR ANY PORTION OF THE SDK YOU ACCEPT ALL THE TERMS AND CONDITIONS OF THIS LICENSE AGREEMENT, INCLUDING, IN PARTICULAR THE LIMITATIONS ON: USE CONTAINED IN SECTIONS 2, 3 AND 4; WARRANTY IN SECTION 7; AND LIABILITY IN SECTION 8. YOU AGREE THAT THIS LICENSE AGREEMENT IS ENFORCEABLE LIKE ANY WRITTEN NEGOTIATED AGREEMENT SIGNED BY YOU. IF YOU DO NOT AGREE, DO NOT USE THIS SDK. IF YOU ACQUIRED THE SDK ON TANGIBLE MEDIA (FOR EXAMPLE, CD-ROM) WITHOUT AN OPPORTUNITY TO REVIEW THIS LICENSE, AND YOU DO NOT ACCEPT THIS LICENSE AGREEMENT, YOU MAY NOT USE THE SDK.
### 1. Definitions
"SDK" means all of the contents of the files, disk(s), CD-ROM(s) or other media with which this License Agreement is provided, including but not limited to (i) Sample Code; (ii) Header File Information; (iii) Redistributable Code, (iv) Documentation; and (v) any upgrades, modified versions, updates, and/or additions thereto, if any, provided to You by Sensory. "Sample Code" means sample software in source code format designated in the Documentation as "Sample Code." "Header File Information" means any header files (*.h files) supplied in connection with the SDK, including without limitation any related information detailing contents of header files. "Redistributable Code" means certain object code files designated in the Documentation as "Redistributable Code." "Documentation" means explanatory materials supplied with the SDK or made available online on Sensory public web pages related to the SDK. "Sensory" means Sensory. Inc., a California corporation, 3150 De La Cruz Blvd., Suite 120, Santa Clara, CA 95054. "Sensory Software" means the generally commercially available versions of Sensory TrulyNatural SDK. "Developer Programs" means Your application programs that are designed to function with Sensory Software products. "Developer," "You," and "Your" refer to any person or entity accessing or using this SDK, or any component thereof. 
"End User License Agreement" means an end user license agreement that provides a (a) limited, nonexclusive right to use the subject Developer Program with no further right to reproduce (except for archival and/or backup copies permitted by law) and/or distribute the subject Developer Program, (b) prohibition against distributing, selling, sublicensing, renting, loaning or leasing the subject Developer Program, (c) prohibition against reverse engineering, decompiling, disassembling or otherwise attempting to discover the source code of the subject Developer Program that is substantially similar to that set forth in Section 3 below, (d) statement that You and your suppliers retain all right, title and interest in the subject Developer Program that is substantially similar to that set forth as Section 5 below, (e) statement that Your suppliers disclaim all warranties, conditions, representations or terms with respect to the subject Developer Program substantially similar to the disclaimer set forth as Section 7 below, and (f) limit of liability substantially similar to that set forth as Section 8 below for the benefit of Your suppliers.
### 2. License
Subject to the terms and conditions of this License Agreement, Sensory grants You a non-exclusive, nontransferable, royalty-free license to (a) use the SDK for the sole purpose of internally developing Developer Programs, (b) reproduce and modify Sample Code as a component of Developer Programs that add significant and primary functionality to the Sample Code, (c) reproduce Redistributable Code solely as a component of Developer Programs that add significant and primary functionality to the Redistributable Code and (d) distribute Sample Code and/or Redistributable Code in object code form only as a component of Developer Programs that add significant and primary functionality to the Sample Code and/or Redistributable Code provided that (i) You distribute such object code under the terms and conditions of an End User License Agreement, (ii) You include a copyright notice reflecting the copyright ownership of Developer in such Developer Programs, (iii) You shall be solely responsible to Your customers for any update or support obligation or other liability which may arise from such distribution, (iv) You shall not make any statements that Your Developer Product is "certified," or that its performance is guaranteed, by Sensory, and (v) You do not use Sensory's name or trademarks to market Your Developer Product without written permission of Sensory. Any modified or merged portion of the Sample Code, and/or merged portion of the Redistributable Code, IS subject to this License Agreement. Use of Sensory Software and/or any other Sensory application program is subject to the applicable end user license agreement for such application software even if such Sensory Software is supplied to You in connection with this License Agreement. 
You may make a limited number of copies of the Documentation to be used by Your employees or consultants for internal development purposes and not for general business purposes or for distribution by any means, and such employees or consultants shall be subject to this License Agreement. You may distribute up to five instances of Your Developer Program with Sensory Software to third parties under this agreement. You may distribute more than five instances of Sensory Software with Your Developer Programs only under separate license from Sensory. Sensory is under no obligation to provide any support under this License Agreement, including upgrades or future versions of the SDK, Sensory Software and/or any component thereof, to Developer, end users, or to any other party. Further developer support, software licensing, trademark licensing and trademark usage information is available through www.Sensoryinc.com.
### 3. Restrictions
Except for the limited distribution rights as provided in Section 2 above with respect to Sample Code, Redistributable Code, and Developer Programs, You may not distribute, sell, sublicense, rent, loan, or lease the SDK, Sensory Software, and/or any component thereof to any third party. You also agree not to reverse engineer, decompile, disassemble or otherwise attempt to discover the source code of the SDK, Sensory Software and/or any component thereof except to the extent (i) you may be expressly permitted to decompile under applicable law, (ii) it is essential to do so in order to achieve operability of the SDK or Sensory Software with another software program, and (iii) you have first requested Sensory to provide the information necessary to achieve such operability and Sensory has not made such information available. Sensory has the right to impose reasonable conditions and to request a reasonable fee before providing such information. Any information supplied by Sensory or obtained by you, as permitted hereunder, may only be used by you for the purpose described herein and may not be disclosed to any third party or used to create any software which is substantially similar to the expression of the SDK and/or Sensory Software.
### 4. Confidential Information
You agree not to disseminate or in any way disclose Header File Information to any person, firm or business except for Your employees who need to know such Header File Information and who have previously agreed to be bound by a confidentiality obligation consistent with the obligation set forth in this Section 4. Further, You agree to treat the Header File Information with the same degree of care as You accord to Your own confidential information, but in any event no less than reasonable care. Your obligations under this section with respect to the Header File Information shall terminate when You can document that such Header File Information was (i) in the public domain at or subsequent to the time it was communicated to You by Sensory through no fault of yours, (ii) developed by Your employees or agents independently of and without reference to any information communicated to You by Sensory; or (iii) disclosed in response to a valid order by a court or other governmental body, as otherwise required by law, or as necessary to establish the rights of either party under this License Agreement.
### 5. Proprietary Rights
You agree to protect Sensory's copyright and other ownership interests in all items in this SDK. You agree that all copies of items in this SDK reproduced for any reason by You will contain the same copyright, trademark, and other proprietary notices as appropriate and appear on or in the master items delivered by Sensory in this SDK. Sensory and/or its suppliers retain all right, title and ownership throughout the world in the intellectual property embodied within the SDK. Except as stated herein, this License Agreement does not grant You any rights to patents, copyrights, trade secrets, trademarks, or any other rights in respect to the items in this SDK.
### 6. Term
This License Agreement is effective until terminated. Sensory has the right to terminate Your License immediately if You fail to comply with any term of this License Agreement. Upon any such termination, You must (a) return all full and partial copies of the items in this SDK immediately to Sensory and (b) discontinue distribution of any Sample Code and/or Redistributable Code. Sections 1, 3, 4, 5, 6, 7, 8, 9, 11 and 12 shall survive any termination and/or expiration of this License Agreement.
### 7. Disclaimer of Warranty
Sensory licenses the SDK to You on an "AS IS" basis and without warranty of any kind. SENSORY AND ITS SUPPLIERS DO NOT AND CANNOT WARRANT THE PERFORMANCE OR RESULTS YOU MAY OBTAIN BY USING THE SDK. EXCEPT FOR ANY WARRANTY, CONDITION, REPRESENTATION OR TERM TO THE EXTENT TO WHICH THE SAME CANNOT OR MAY NOT BE EXCLUDED OR LIMITED BY LAW APPLICABLE TO YOU IN YOUR JURISDICTION, SENSORY AND ITS SUPPLIERS MAKE NO WARRANTIES, CONDITIONS, REPRESENTATIONS OR TERMS, EXPRESS OR IMPLIED, WHETHER BY STATUTE, COMMON LAW, CUSTOM, USAGE OR OTHERWISE AS TO THE SDK OR ANY COMPONENT THEREOF, INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT OF THIRD PARTY RIGHTS, INTEGRATION, MERCHANTABILITY, SATISFACTORY QUALITY OR FITNESS FOR ANY PARTICULAR PURPOSE. Some states or provinces do not allow the exclusion of implied warranties so the above limitations may not apply to You. You may have rights that vary from jurisdiction to jurisdiction. For further warranty information, You may contact Sensory at the address provided above.
### 8. Limitation of Liability
IN NO EVENT WILL SENSORY OR ITS SUPPLIERS BE LIABLE TO YOU FOR ANY DAMAGES, CLAIMS OR COSTS WHATSOEVER ARISING FROM THIS LICENSE AGREEMENT AND/OR YOUR USE OF THE SDK OR ANY COMPONENT THEREOF, INCLUDING WITHOUT LIMITATION ANY CONSEQUENTIAL, INDIRECT, INCIDENTAL DAMAGES, OR ANY LOST PROFITS OR LOST SAVINGS, EVEN IF A SENSORY REPRESENTATIVE HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH LOSS, DAMAGES, CLAIMS OR COSTS OR FOR ANY CLAIM BY ANY THIRD PARTY. THE FOREGOING LIMITATIONS AND EXCLUSIONS APPLY TO THE EXTENT PERMITTED BY APPLICABLE LAW IN YOUR JURISDICTION. SENSORY'S AGGREGATE LIABILITY AND THAT OF ITS SUPPLIERS UNDER OR IN CONNECTION WITH THIS LICENSE AGREEMENT SHALL BE LIMITED TO FIFTY U.S. DOLLARS ($50.00). Nothing contained in this License Agreement limits Sensory's liability to You in the event of death or personal injury resulting from Sensory's negligence or for the tort of deceit (fraud). Sensory is acting on behalf of its suppliers for the purpose of disclaiming, excluding and/or limiting obligations, warranties and liability as provided in this License Agreement, but in no other respects and for no other purpose.
### 9. Indemnification
You agree to defend, indemnify, and hold Sensory and its suppliers harmless from and against any claims or lawsuits, including attorneys' reasonable fees, that arise or result from the use or distribution of Developer Programs, provided that Sensory gives You prompt written notice of any such claim, tenders to You the defense or settlement of such a claim at Your expense, and cooperates with You, at Your expense, in defending or settling such claim.
### 10. Government Regulations
You agree that any Developer Program that includes Sample Code and/or Redistributable Code (i) will include in its license agreement a reference to applicable U.S. Government regulations which control licensing of software and (ii) will not be shipped, transferred, or exported into any country or used in any manner prohibited by the United States Export Administration Act or any other export laws, restrictions or regulations (collectively the "Export Laws"). In addition, if any part of the SDK is identified as export controlled items under the Export Laws, you represent and warrant that you are not a citizen, or otherwise located within, an embargoed nation (including without limitation Iran, Iraq, Syria, Sudan, Libya, Cuba, North Korea and Serbia) and that you are not otherwise prohibited under the Export Laws from receiving the SDK. All rights to use the SDK are granted on condition that such rights are forfeited if you fail to comply with the terms of this License Agreement.
### 11. Governing Law
This License Agreement will be governed by and construed in accordance with the substantive laws in force in the State of California. The courts of Santa Clara County, California shall each have exclusive jurisdiction over all disputes relating to this License Agreement. This License Agreement will not be governed by the conflict of law rules of any jurisdiction or the United Nations Convention on Contracts for the International Sale of Goods, the application of which is expressly excluded.
### 12. General
You may not assign Your rights or obligations granted under this License Agreement without the prior written consent of Sensory. None of the provisions of this License Agreement shall be deemed to have been waived by any act or acquiescence on the part of Sensory, its agents, or employees, but only by an instrument in writing signed by an authorized signatory of Sensory. It is expressly agreed that a breach of Section 3 or 4 of this License Agreement will cause irreparable harm to Sensory and that a remedy at law will be inadequate. Therefore, in addition to any and all remedies available at law, Sensory will be entitled to seek an injunction or other equitable remedies in all legal proceedings in the event of any threatened or actual violation thereof. When conflicting language exists between this License Agreement and any other agreement included in this SDK (except for the Integration Key License Agreement or any agreement supplied with Sensory Software), this License Agreement shall supersede. If either Sensory or Developer employs attorneys to enforce any rights arising out of or relating to this License Agreement, the prevailing party shall be entitled to recover reasonable attorneys' fees. You acknowledge that You have read this License Agreement, understand it, and that it is the complete and exclusive statement of Your agreement with Sensory which supersedes any prior agreement, oral or written, between Sensory and You with respect to the licensing to You of this SDK. No variation of the terms of this License Agreement will be enforceable against Sensory unless Sensory gives its express consent in a writing signed by an authorized signatory of Sensory.
Sensory, TrulyHandsfree, TrulyNatural, and the Sensory logo, are either trademarks or registered trademarks of Sensory, Inc. in the United States and/or other countries.
*[ROM]: Read-Only Memory, typically nonvolatile flash memory
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "licenses/oss.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/licenses/oss/"
---
# Open Source licenses
One or more of the libraries included in this TrulyNatural SDK uses
third-party Open Source components with [permissive license agreements][oss-permissive].
See the _README\*.md_ files in _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/lib//_
for library-specific details.
You can omit all Open Source Software from a TrulyNatural binary by:
- Compiling with `-DSNSR_OMIT_OSS_COMPONENTS` (see [Compile-time macros § SNSR_OMIT_OSS_COMPONENTS](https://doc.sensory.com/tnl/7.8/api/compile-macros.md#snsr-omit-oss-components))
- or by using custom initialization with models that do not require
these components. See [model:ids](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#modelids), [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit), and
[reduce code size](https://doc.sensory.com/tnl/7.8/faq.md#reduce-code-size).
Query [oss-components](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#oss-components) to determine which of these modules are linked
into an application.
### Details: [WebRTC](https://webrtc.googlesource.com/src/+/refs/heads/main/LICENSE)
### Details: [ONNX Runtime](https://github.com/microsoft/onnxruntime/blob/v1.21.1/LICENSE)
### Details: [ONNX Runtime dependencies](https://github.com/microsoft/onnxruntime/blob/v1.21.1/ThirdPartyNotices.txt)
[oss-permissive]: https://en.wikipedia.org/wiki/Permissive_software_license "Grants use rights, forbids almost nothing"
*[API]: Application Programming Interface
*[OSS]: Open-source software
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "models/downloads.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/downloads/"
---
# Downloads _(STT only)_
The following STT models are available for download.
These are compatible with TrulyNatural STT SDK [7.7.0](https://doc.sensory.com/tnl/7.8/changes/index.md#v7.7.0) and later.
Contact your account representative or [Sensory sales](https://doc.sensory.com/tnl/7.8/contact.md#contact) for
additional languages and customizations.
**Filename key:**
`opt-vg-vad-stt-`
- These are pipelines made from the [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-opt-spot-vad-lvcsr) template with
a US English "Voice Genie" wake word in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) and an STT recognizer
in slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1).
`-B`
- Model includes an NLU component that identifies intents and entities.
`-pnc`
- Model includes punctuation and capitalization.
`-slm`
- Model includes a small generative language model.
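The filename key lends itself to a small helper that reports which components a model name advertises. The sketch below is illustrative only: `ModelTags`, `hasTag`, and `parseModelName` are hypothetical names, not SDK API, and the assumption that a tag is always followed by `-`, `_`, `.`, or the end of the name is inferred from the filenames listed on this page rather than from a documented naming rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical summary of the filename key described above. */
typedef struct {
  bool pipeline; /* name starts with "opt-vg-vad-stt-" */
  bool nlu;      /* "-B": NLU component */
  bool pnc;      /* "-pnc": punctuation and capitalization */
  bool slm;      /* "-slm": small generative language model */
} ModelTags;

/* True if `tag` occurs in `name` as a whole tag, that is, followed by a
   delimiter or the end of the string (assumed delimiter set: - _ .). */
static bool hasTag(const char *name, const char *tag) {
  size_t n = strlen(tag);
  for (const char *p = strstr(name, tag); p; p = strstr(p + 1, tag)) {
    char next = p[n];
    if (next == '\0' || next == '-' || next == '_' || next == '.') return true;
  }
  return false;
}

/* Classifies a model filename according to the key. */
static ModelTags parseModelName(const char *name) {
  static const char prefix[] = "opt-vg-vad-stt-";
  ModelTags t;
  assert(name != NULL);
  t.pipeline = strncmp(name, prefix, sizeof prefix - 1) == 0;
  t.nlu = hasTag(name, "-B");
  t.pnc = hasTag(name, "-pnc");
  t.slm = hasTag(name, "-slm");
  return t;
}
```

Checking the character after each tag keeps a short tag such as `-B` from matching a longer word that merely begins with the same letters.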
Larger models are more accurate but also require more CPU cycles.
| Language | Domain | Size in MiB | Model |
|:---------------------------------|:-----------|--------------------------------------------:|:------------------------------------------------------------------------------------------------------------------------------------------------------------|
| English (US) | automotive | 226 | [opt-vg-vad-stt-enUS-automotive-large-1.3.14-B-pnc_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-automotive-large-1.3.14-B-pnc_66.snsr) |
| English (US) | automotive | 91 | [opt-vg-vad-stt-enUS-automotive-medium-2.3.14-B-pnc_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-automotive-medium-2.3.14-B-pnc_66.snsr) |
| English (US) | automotive | 49 | [opt-vg-vad-stt-enUS-automotive-small-2.3.14-B-pnc_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-automotive-small-2.3.14-B-pnc_66.snsr) |
| English (US) | general | 11 | [opt-vg-vad-stt-enUS-general-micro-2.0.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-general-micro-2.0.3_66.snsr) |
| English (US) | general | 7 | [opt-vg-vad-stt-enUS-general-nano-2.0.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-general-nano-2.0.3_66.snsr) |
| English (US) | general | 199 | [opt-vg-vad-stt-enUS-general-large-2.0.3-pnc_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-general-large-2.0.3-pnc_66.snsr) |
| English (US) | general | 67 | [opt-vg-vad-stt-enUS-general-medium-2.4.3-pnc_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-general-medium-2.4.3-pnc_66.snsr) |
| English (US) | general | 28 | [opt-vg-vad-stt-enUS-general-small-2.2.3-pnc_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enUS-general-small-2.2.3-pnc_66.snsr) |
| English (British) | general | 196 | [opt-vg-vad-stt-enGB-general-large-2.0.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enGB-general-large-2.0.3_66.snsr) |
| English (British) | general | 64 | [opt-vg-vad-stt-enGB-general-medium-2.0.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enGB-general-medium-2.0.3_66.snsr) |
| English (British) | general | 25 | [opt-vg-vad-stt-enGB-general-small-2.0.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-enGB-general-small-2.0.3_66.snsr) |
| German | general | 199 | [opt-vg-vad-stt-deDE-general-large-2.2.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-deDE-general-large-2.2.3_66.snsr) |
| German | general | 64 | [opt-vg-vad-stt-deDE-general-medium-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-deDE-general-medium-2.3.3_66.snsr) |
| German | general | 25 | [opt-vg-vad-stt-deDE-general-small-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-deDE-general-small-2.3.3_66.snsr) |
| French | general | 202 | [opt-vg-vad-stt-frFR-general-large-2.0.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-frFR-general-large-2.0.3_66.snsr) |
| French | general | 64 | [opt-vg-vad-stt-frFR-general-medium-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-frFR-general-medium-2.3.3_66.snsr) |
| French | general | 25 | [opt-vg-vad-stt-frFR-general-small-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-frFR-general-small-2.3.3_66.snsr) |
| Italian | general | 197 | [opt-vg-vad-stt-itIT-general-large-1.2.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-itIT-general-large-1.2.3_66.snsr) |
| Italian | general | 64 | [opt-vg-vad-stt-itIT-general-medium-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-itIT-general-medium-2.3.3_66.snsr) |
| Italian | general | 25 | [opt-vg-vad-stt-itIT-general-small-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-itIT-general-small-2.3.3_66.snsr) |
| Japanese | general | 215 | [opt-vg-vad-stt-jaJP-general-large-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-jaJP-general-large-2.3.3_66.snsr) |
| Japanese | general | 64 | [opt-vg-vad-stt-jaJP-general-medium-2.2.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-jaJP-general-medium-2.2.3_66.snsr) |
| Japanese | general | 26 | [opt-vg-vad-stt-jaJP-general-small-2.4.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-jaJP-general-small-2.4.3_66.snsr) |
| Korean | general | 215 | [opt-vg-vad-stt-koKR-general-large-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-koKR-general-large-2.3.3_66.snsr) |
| Korean | general | 64 | [opt-vg-vad-stt-koKR-general-medium-2.4.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-koKR-general-medium-2.4.3_66.snsr) |
| Korean | general | 25 | [opt-vg-vad-stt-koKR-general-small-2.4.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-koKR-general-small-2.4.3_66.snsr) |
| Spanish | general | 197 | [opt-vg-vad-stt-esES-general-large-2.2.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-esES-general-large-2.2.3_66.snsr) |
| Spanish | general | 64 | [opt-vg-vad-stt-esES-general-medium-2.4.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-esES-general-medium-2.4.3_66.snsr) |
| Spanish | general | 25 | [opt-vg-vad-stt-esES-general-small-2.3.3_66](https://cc2.fluent-speech.com/cloud/v12/opt-vg-vad-stt-esES-general-small-2.3.3_66.snsr) |
**Provenance:**
The wake word, and the speech-to-text acoustic, language, and NLU models
are owned by Sensory and have no third-party dependencies.
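The filename key above can be checked mechanically. The C sketch below tests a model name for the documented `opt-vg-vad-stt-` prefix and for the `-B`, `-pnc`, and `-slm` markers. `ModelNameInfo` and `describeModelName()` are hypothetical helpers, not part of the TrulyNatural SDK. Dropping everything from the first `_` onward (the `_66` suffix) is an assumption based only on the names in the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type; not part of the TrulyNatural SDK. */
typedef struct {
  bool isPipeline;  /* name starts with "opt-vg-vad-stt-" */
  bool hasNlu;      /* "-B":   NLU component */
  bool hasPnc;      /* "-pnc": punctuation and capitalization */
  bool hasSlm;      /* "-slm": small generative language model */
} ModelNameInfo;

/* Inspect a model filename using only the documented filename key. */
ModelNameInfo
describeModelName(const char *name)
{
  static const char prefix[] = "opt-vg-vad-stt-";
  ModelNameInfo info = { false, false, false, false };
  char buf[256];
  char *tok;
  size_t len;

  assert(name != NULL);
  info.isPipeline = strncmp(name, prefix, sizeof prefix - 1) == 0;

  /* Work on a copy: strtok() modifies its argument. */
  snprintf(buf, sizeof buf, "%s", name);
  /* Assumption: anything from the first '_' on (e.g. "_66.snsr") is a suffix. */
  buf[strcspn(buf, "_")] = '\0';
  /* Strip a trailing ".snsr" extension if it is still present. */
  len = strlen(buf);
  if (len > 5 && strcmp(buf + len - 5, ".snsr") == 0) buf[len - 5] = '\0';

  /* Each hyphen-separated field is compared against the documented markers. */
  for (tok = strtok(buf, "-"); tok != NULL; tok = strtok(NULL, "-")) {
    if (strcmp(tok, "B") == 0) info.hasNlu = true;
    else if (strcmp(tok, "pnc") == 0) info.hasPnc = true;
    else if (strcmp(tok, "slm") == 0) info.hasSlm = true;
  }
  return info;
}
```

For example, `describeModelName("opt-vg-vad-stt-enUS-automotive-large-1.3.14-B-pnc_66.snsr")` reports a pipeline with NLU and PNC but no SLM.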
*[API]: Application Programming Interface
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[NLU]: Natural Language Understanding model
*[PNC]: Punctuation and Capitalization, an STT model variant that emits cased text with punctuation
*[SDK]: Software Development Kit
*[SLM]: Generative Small Language Model
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector
---
source_path: "models/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/"
---
# Models
This distribution includes sample models in _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/model/_.
You can also [download models](https://doc.sensory.com/tnl/7.8/models/downloads.md#models-downloads) for additional languages
in a range of sizes.
Console examples in this section assume _$HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0_ as the SDK
install directory; replace that prefix if you installed elsewhere.
## Wake words
See the [wake word model type](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) for a description of
model behavior and settings.
### spot-voicegenie-enUS-6.5.1-m.snsr
Fixed-phrase "Voice Genie" wake word for US English.
### spot-hbg-enUS-1.4.0-m.snsr
Fixed-phrase "Hello Blue Genie" wake word for US English.
### spot-music-enUS-1.2.0-m.snsr
Music command set for US English. Commands include
"play music", "pause music", "stop music", "previous song", and "next song".
## Adapting wake word
See the [adapting wake word model type](https://doc.sensory.com/tnl/7.8/models/types/ca.md#ca-type) for a description of
model behavior and settings.
### ca-voicegenie-enUS-1.1.0.snsr
This is a fixed-phrase spotter for "Voice Genie" in US English that adapts to users'
speech to reduce false-accept rates.
Model adaptation and enrollment happen automatically and require no
additional code; you can use this model as a drop-in replacement for the fixed-phrase [Voice Genie](https://doc.sensory.com/tnl/7.8/models/index.md#spot-voicegenie-enUS) spotter.
Configuration settings of particular interest include [cache-file](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#cache-file) and [max-users](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-users).
Use [user-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#user-iterator), [delete-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#delete-user), and [rename-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#rename-user) to manage user enrollments.
**Note:**
This model requires support for [multi-threading](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#thread-support).
### Details
The reported [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) value changes once
enrollment has identified and enrolled a new speaker.
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
# two different speakers saying "voice genie"
% bin/snsr-eval -t model/ca-voicegenie-enUS-1.1.0.snsr \
    -s cache-file=ca-vg-cache.snsr \
    -s max-users=3
2235 3045 voice_genie
6810 7545 voice_genie
11745 12525 user1/voice_genie
16845 17595 user1/voice_genie
29355 30180 voice_genie
34845 35820 user2/voice_genie
37815 38520 user1/voice_genie
40080 40905 user2/voice_genie
^C
# restart, loading enrollments from the cache file
% bin/snsr-eval -t model/ca-voicegenie-enUS-1.1.0.snsr \
    -s cache-file=ca-vg-cache.snsr \
    -s max-users=3
12045 13035 user2/voice_genie
15180 15840 user1/voice_genie
17745 18465 user1/voice_genie
20175 20820 user2/voice_genie
^C
```
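If an application post-processes this console output, each result line can be split into its fields. The sketch below is only an illustration. It assumes the first two columns are the begin and end timestamps of the detection and that an enrolled speaker appears as a `userN/` prefix on the phrase, as in the transcript above. `SpotResult` and `parseSpotResult()` are hypothetical names, not SDK functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one snsr-eval result line. */
typedef struct {
  long begin;       /* first column, assumed to be the begin timestamp */
  long end;         /* second column, assumed to be the end timestamp */
  char user[96];    /* "" when the result has no "userN/" prefix */
  char phrase[96];  /* e.g. "voice_genie" */
} SpotResult;

/* Returns 1 if the line parsed as a result, 0 otherwise (prompts, ^C, ...). */
int
parseSpotResult(const char *line, SpotResult *out)
{
  char text[96];
  const char *slash;

  assert(line != NULL && out != NULL);
  if (sscanf(line, "%ld %ld %95s", &out->begin, &out->end, text) != 3) {
    return 0;
  }
  slash = strchr(text, '/');
  if (slash != NULL) {
    /* "user1/voice_genie": the speaker is before the slash, the phrase after. */
    snprintf(out->user, sizeof out->user, "%.*s", (int)(slash - text), text);
    snprintf(out->phrase, sizeof out->phrase, "%s", slash + 1);
  } else {
    /* No enrolled speaker was reported for this detection. */
    out->user[0] = '\0';
    snprintf(out->phrase, sizeof out->phrase, "%s", text);
  }
  return 1;
}
```

An empty `user` field then distinguishes detections made before a speaker was enrolled from those attributed to `user1`, `user2`, and so on.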
## Wake word enrollment
See the [wake word enrollment model type](https://doc.sensory.com/tnl/7.8/models/types/enroll.md#enroll-type) for a description of
model behavior and settings.
### eft-hbg-enUS-23.0.0.9.snsr
EFT spotter for "Hello Blue Genie", US English.
This model produces wake words with a low imposter accept rate.
### udt-universal-3.67.1.0.snsr
UDT enrollment. This model creates spotters with nine different operating points and
supports multiple languages.
Optimized for German, English (Australian, British, Indian, United States),
Spanish (European Union, North American),
French (European Union),
Italian, Korean, Brazilian Portuguese, and
Mandarin Chinese.
### udt-enUS-5.1.1.9.snsr
UDT enrollment with backwards compatibility.
**Note:**
This older model produces enrolled wake words with reduced accuracy.
Use this model only when targeting a [THF Micro][] 3.x DSP port,
or when the wake word is followed by additional validation.
## VAD
See the [VAD model type](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) for a description of
model behavior and settings.
### vad-ml-3.0.0.snsr
Deep-learned stand-alone Voice Activity Detector.
## LVCSR _(TrulyNatural only)_
See the [LVCSR model type](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) for a description of
model behavior and settings.
### lvcsr-build-enUS-14.0.2-5MB.snsr _(TrulyNatural only)_
US English recognizer with 4.9 MiB acoustic model and support for [grammar-based recognition](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-based-recognition).
Supports classes and NLU. Use [search.frame-nota](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#searchframe-nota) to adjust out-of-grammar rejection.
**Example:**
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% snsr-eval -t model/lvcsr-build-enUS-14.0.2-5MB.snsr \
    -s partial-result-interval=0 \
    -f grammar-stream data/grammars/enrollments-nlu-slot.txt \
    data/enrollments/armadillo-1-2-c.wav
NLU intent: navigate (0) = how far away is winco
NLU entity: place (0) = winco
285 1995 armadillo how far away is winco
```
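The `NLU intent:` and `NLU entity:` lines in this output have a regular shape, so a script that consumes them can pull out the fields with a single `sscanf()` call. This is a minimal sketch based only on the two lines shown above. `NluLine` and `parseNluLine()` are hypothetical names, the meaning of the parenthesized number is not described here (it is stored as an opaque index), and names containing spaces are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one "NLU ..." line of snsr-eval output. */
typedef struct {
  char kind[16];    /* "intent" or "entity" */
  char name[64];    /* e.g. "navigate" or "place" */
  int  index;       /* number in parentheses; meaning assumed opaque */
  char value[128];  /* text after " = " */
} NluLine;

/* Returns 1 if the line is an NLU intent/entity line, 0 otherwise. */
int
parseNluLine(const char *line, NluLine *out)
{
  assert(line != NULL && out != NULL);
  /* Field widths are one less than the buffer sizes to leave room for NUL. */
  if (sscanf(line, "NLU %15[a-z]: %63s (%d) = %127[^\n]",
             out->kind, out->name, &out->index, out->value) != 4) {
    return 0;
  }
  return strcmp(out->kind, "intent") == 0 || strcmp(out->kind, "entity") == 0;
}
```

Lines that do not start with `NLU`, such as the final recognition result, are rejected and can be handled separately.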
### lvcsr-lib-enUS-14.0.2.snsr _(TrulyNatural only)_
US English class library. This provides pre-compiled classes for
common domains. Use it with [lvcsr-build-enUS](https://doc.sensory.com/tnl/7.8/models/index.md#lvcsr-build-enUS) to simplify
[grammar-based recognition](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-based-recognition) grammars.
See [class libraries](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-class-libraries) for a usage example.
_lvcsr-lib-enUS-14.0.2.json_ provides the content of the class
description table below in machine-readable format.
### Details: lvcsr-lib-enUS-14.0.2.snsr class library
Class Name
Description
s.alarm-phrases
Basic commands for alarm.
- language: enUS
- version: 0.0.1
- description: Basic commands for alarm.
- detail: Phrases for setting an alarm, including specific times (see s.time).
- examples: wake me up at
s.alphanumeric
Matches an individual letter (a-z) or an individual integer (0-9).
- language: enUS
- version: 0.0.2
- description: Matches an individual letter (a-z) or an individual integer (0-9).
- detail: Individual alphabet letters including enGB 'zed' and adjectives used for spelling, plus individual numbers zero through nine, including 'oh'.
- examples: zero, oh, , , zed, cap , capital , big , lowercase , uppercase , little , double
- category: characters
s.call-emergency
Common ways of calling for help in case of emergency.
- language: enUS
- version: 0.0.1
- description: Common ways of calling for help in case of emergency.
- detail: Ways to call emergency services; does not include "help" or "help me". Use this grammar with caution.
- examples: , it's an emergency, it is an emergency, this is an emergency, i'm having an emergency, we're having an emergency
- category: emergency
s.clock-phrases
Basic commands for setting a clock.
- language: enUS
- version: 0.0.1
- description: Basic commands for setting a clock.
- detail: Phrases for setting a clock, including specific times (see s.time).
- examples: time, what time is it, what is the time, what's the time, set time to , change time to , set clock to , change clock to
- category: command sets
Matches individual color names.
- language: enUS
- version: 0.0.4
- description: Matches individual color names.
- detail: Primary and secondary colors.
- examples: black, blue, brown, gray, green, orange, pink, purple, red, white, yellow
- category: color
s.control.car
Simple command set for car voice control.
- language: enUS
- version: 0.0.1
- description: Simple command set for car voice control.
- detail: Basic commands for car voice control, including door controls, window controls, environment controls, basic stereo control, mirror/wiper/lights control.
- examples: open driver's side window, close the passenger's side window, roll down front windows, roll up all the windows, lock passenger's side door, child lock back doors, unlock front doors, turn off front defroster, turn on heat, turn up heater, turn down the A C, turn fan up, turn on windshield wipers, open garage, close garage door, turn on navigation, end navigation, turn on radio, turn the front speakers up, turn down the rear speakers, turn up treble, turn the bass down, increase back wiper speed, decrease wiper speed, unfold side mirrors, fold right side mirror, turn on driver's side seat warmer, turn the passenger's side seat warmer down, turn on the dome light
- category: command sets
s.control.door
Basic commands for smart lock/door control.
- language: enUS
- version: 0.0.1
- description: Basic commands for smart lock/door control.
- detail: Simple command set for controlling a smart lock/door.
- examples: lock the door, unlock my door, open door, close the door, is the door open, is my door open
- category: command sets
s.control.environment
General, basic commands for environment control.
- language: enUS
- version: 0.0.1
- description: General, basic commands for environment control.
- detail: Commands for environment control devices, such as fans, AC, space heaters.
- examples: turn on, turn off, turn up, turn down, start, stop, set speed to, change speed to, set speed to , change speed to , set to percent, change to percent, set to , change to
- category: command sets
s.control.lights
Simple command set for voice-controlled lights.
- language: enUS
- version: 0.0.1
- description: Simple command set for voice-controlled lights.
- detail: Basic commands for voice-controlled lights.
- examples: turn on all the hallway lights, turn off closet light, dim the den lights, brighten all the master bath lights, set foyer light to off, turn all the breakfast nook lights on
- category: command sets
s.control.media
Basic media control phrases.
- language: enUS
- version: 0.0.1
- description: Basic media control phrases.
- detail: Basic, common controls for music, movies, etc.
- examples: play, pause, stop, skip, skip to, next, fast forward, back, rewind, fast rewind, reverse, repeat, start, shuffle
- category: command sets
s.control.media.tv
Simple command set for voice-controlled television.
- language: enUS
- version: 0.0.1
- description: Simple command set for voice-controlled television.
- detail: Basic commands for voice-controlled television, see s.switch for specific subcommands for "".
- examples: cable, music, browser, apps, streaming, channel , recordings, guide, input , turn on the t v, switch off my television, t v off, television on, switch t v off, turn television on, power on, power off, turn volume to , adjust volume down , turn volume up, mute t v, unmute the television, turn on closed captioning, next channel, channel up, channel down
- category: command sets
s.control.media.volume
Simple commands for audio volume control.
- language: enUS
- version: 0.0.1
- description: Simple commands for audio volume control.
- detail: Typical, general commands for controlling audio volume, for home/car smart speaker, television, etc.
- examples: increase the , decrease the , turn up the , turn down the , turn the up, turn to five, turn up to ten, turn down to one, mute , louder, softer, quieter, up, down
- category: command sets
s.control.phone
Basic commands for calling/messaging control.
- language: enUS
- version: 0.0.2
- description: Basic commands for calling/messaging control.
- detail: Typical, open-ended calling/messaging commands for voice-control assistant on phone. All can be followed by a specific entity/contact name.
- examples: send text, send a text to, send voice message, send audio message to, reply to, text to, show message, play message from, show emails, show me emails from, read my recent messages, show my new messages from, play all voice messages, play all voicemail, play new voicemail messages from, send email, send an email to, show contacts, call
- category: command sets
s.control.thermostat
Basic commands for thermostat.
- language: enUS
- version: 0.0.1
- description: Basic commands for thermostat.
- detail: Phrases for setting a thermostat, including specific temperatures in C or F (see s.temperature.thermostat.celsius and s.temperature.thermostat.fahrenheit).
- examples: set thermostat to , turn thermostat up degrees, turn thermostat down degrees, make it much warmer, make it much cooler, make it degrees warmer, make it degrees cooler, what is the temperature
- category: command sets
s.control.vacuum
Basic commands for vacuum cleaner.
- language: enUS
- version: 0.0.1
- description: Basic commands for vacuum cleaner.
- detail: Phrases for a voice-controlled vacuum cleaner, including room names to direct vacuum (see s.rooms).
- examples: start vacuuming, stop vacuuming, resume vacuuming, end vacuuming, pause vacuuming, unpause vacuuming, start vacuum, stop vacuum, pause vacuum, unpause vacuum, dock vacuum, charge vacuum, where is the vacuum, is the vacuum charging, is the vacuum charged, is vacuum docked, vacuum the , start vacuuming the , resume vacuuming the , end vacuuming the
- category: command sets
s.control.virtual-meeting
Basic commands for controlling a virtual meeting platform.
- language: enUS
- version: 0.0.1
- description: Basic commands for controlling a virtual meeting platform.
- detail: Assortment of basic commands for interacting with and controlling a virtual meeting platform.
- examples: mute, mute self, mute all, unmute, unmute self, unmute all, initiate new meeting, new meeting, schedule meeting, go to upcoming meetings, past meetings, go to recordings, go to chat, help, test audio, test video, switch microphone, switch video, leave meeting, blur background
- category: command sets
s.date
Common ways of saying individual dates.
- language: enUS
- version: 0.0.1
- description: Common ways of saying individual dates.
- detail: Covers dates from January 1, 1800 to December 31, 2099; with and without years.
- examples: first of january, the second of february, the third of march two thousand fifteen, the fourth of april two thousand and fifteen, may fifth, june the sixth, july seventh nineteen eighty, august the eighth twenty oh two
- category: date
s.duration-queries
General queries about how long something is, to be followed by nouns.
- language: enUS
- version: 0.0.1
- description: General queries about how long something is, to be followed by nouns.
- detail: An assortment of open-ended queries involving multiple interrogatives which can be combined with entities for a variety of duration applications. (Does not include phrases specific to setting a timer; see s.timer-phrases.grm for this application.)
- examples: how long is, for how long is, what's the duration of, what is the length of time of, please show me how long is, tell me the duration of, I want to know what the length of time is of
- category: commands
s.email
Common ways of spelling out individual email addresses.
- language: enUS
- version: 0.0.1
- description: Common ways of spelling out individual email addresses.
- detail: Produces a spelled-out email address, with common domains or custom, spelled-out domain.
- examples: a b c at gmail dot com, h e l l o one two three at yahoo dot com, m y dot e m a i l underscore a d d r e s s at outlook dot com, one dash two dash three at i cloud dot com, a plus b plus c at aol dot com, x y z at hotmail dot com, i j k at ms dot com, e m a i l at m y d o m a i n dot org, e m a i l at y o u r dash d o m a i n dot com
- category: email
s.help
Device assistance commands.
- language: enUS
- version: 0.0.2
- description: Device assistance commands.
- detail: Common ways to ask for help with a device.
- examples: help, help me, help menu, what do i say, what can i do, how can i use this, how do i use this thing
- category: help
s.increase-decrease
Increase/decrease commands.
- language: enUS
- version: 0.0.2
- description: Increase/decrease commands.
- detail: Generic increase and decrease language. No "bump it", "make it quieter/hotter", etc.
- examples: turn up, turn it down, decrease, increase, crank it, crank up
- category: commands
s.integer-billions
Matches individual long forms of integers from 1 billion to 999 billion.
- language: enUS
- version: 0.0.1
- description: Matches individual long forms of integers from 1 billion to 999 billion.
- detail: Does not include common rounded/float versions (e.g., 1.5 billion), or 'a billion' (for easier integration in a larger number set).
- examples: one billion, twelve billion five million and three hundred, ninety billion and ninety nine million nine thousand and one, one hundred eighty billion twenty one million, three hundred twenty one billion and eighty two million and two, nine hundred ninety nine billion nine hundred ninety nine million nine hundred ninety nine thousand nine hundred and ninety nine
- category: numbers
s.integer-millions
Matches individual long forms of integers from 1 million to 999 million.
- language: enUS
- version: 0.0.1
- description: Matches individual long forms of integers from 1 million to 999 million.
- detail: Does not include common rounded/float versions (e.g., 1.5 million), or 'a million' (for easier integration in a larger number set).
- examples: one million, one million one hundred thousand, one million one hundred thousand and one, two million and sixteen, three hundred million three thousand and two, nine hundred ninety nine million nine hundred ninety nine thousand nine hundred and ninety nine
- category: numbers
s.integer-thousands
Matches individual numbers from one thousand to 999 thousand.
- language: enUS
- version: 0.0.1
- description: Matches individual numbers from one thousand to 999 thousand.
- detail: Does not include 'a thousand', etc, for easier integration in a larger number set.
- examples: one thousand, two thousand and one, two thousand eight hundred and ninety nine, nine hundred ninety nine thousand nine hundred and ninety nine
- category: numbers
General queries to be followed by place names.
- language: enUS
- version: 0.0.1
- description: General queries to be followed by place names.
- detail: An assortment of open-ended queries involving multiple interrogatives which can be combined with entities for location-app-specific purposes.
- examples: where is, what are the directions to, map my route to, show the way to, I need a map to, tell me how to get to, how do I get to, guide me to, what's the way I could drive to, what is the way I might find, how do you drive to, tell me how to locate, how would I get to, get me to, how might one drive to, could you please help me find, please tell me where one would locate
- category: commands
s.money
Matches individual US currency expressions.
- language: enUS
- version: 0.0.1
- description: Matches individual US currency expressions.
- detail: Combinations of cents and dollars up to 100 dollars.
- examples: zero cents, zero dollars, one cent, one dollar, cents, , , and cents, ('two ninety nine'), oh ('three oh five')
- category: money
s.noun-queries
General queries about where something is, to be followed by nouns.
- language: enUS
- version: 0.0.1
- description: General queries about where something is, to be followed by nouns.
- detail: An assortment of open-ended queries involving multiple interrogatives which can be combined with entities for a variety of applications. (Does not include location-app-specific queries involving words "drive", "guide", "route", or "map".)
- examples: find, get, locate, show, show me, help me find, help me locate, please locate, where is, what's the way to get, what's the way one might get to, what's the way you can get to, what's the way I could locate, what's the way to see, show me where to find, please show me where one would get, tell me how I find, please tell me how one might get, where would I see, where might one locate, how can one get to, how would I find, how do I see, I want to know how one might get to
- category: commands
s.number-integer-0-1trillion
Matches individual cardinal numbers zero through one trillion.
- language: enUS
- version: 0.0.1
- description: Matches individual cardinal numbers zero through one trillion.
- detail: Zero through one trillion, with optional 'and' between trillion, millions, thousand, etc. components.
- examples: zero, one, ninety eight, one hundred ninety eight, six thousand one hundred and ninety eight, five million six thousand six hundred thousand thirty two, one and a half billion, a trillion, one trillion
- category: numbers
s.number-integer-0-9
Matches individual cardinal numbers zero through nine.
- language: enUS
- version: 0.0.1
- description: Matches individual cardinal numbers zero through nine.
- detail: Zero through nine.
- examples: zero, one, two, three, four, five, six, seven, eight, nine
- category: numbers
s.number-integer-0-100
Matches cardinal numbers zero through one hundred.
- language: enUS
- version: 0.0.1
- description: Matches cardinal numbers zero through one hundred.
- detail: Zero through one hundred, including 'a hundred'.
- examples: zero, one, eleven, thirty, thirty four, ninety nine, one hundred, a hundred
- category: numbers
s.number-integer-0-999
Matches individual cardinal numbers zero through nine hundred ninety nine.
- language: enUS
- version: 0.0.3
- description: Matches individual cardinal numbers zero through nine hundred ninety nine.
- detail: Zero through nine hundred ninety nine, with optional 'and' between hundreds and tens component.
- examples: zero, one, eleven, twenty one, one hundred twelve, nine hundred and ninety nine
- category: numbers
Matches individual cardinal numbers one hundred thousand through nine hundred ninety nine thousand nine hundred ninety nine.
- language: enUS
- version: 0.0.1
- description: Matches individual cardinal numbers one hundred thousand through nine hundred ninety nine thousand nine hundred ninety nine.
- detail: One hundred thousand through nine hundred ninety nine thousand nine hundred ninety nine, with optional 'and' between thousands, hundreds, etc., components.
- examples: one hundred thousand, one hundred and two thousand, four hundred twenty one thousand and three hundred, forty one thousand one hundred and three
- category: numbers
On and off commands.
- language: enUS
- version: 0.0.1
- description: On and off commands.
- detail: Common expressions for turning devices on and off.
- examples: turn off, turn on, switch off, switch on, turn off, turn on, start, stop
- category: on-off
s.ordering
General, open-ended phrases for ordering/requesting.
- language: enUS
- version: 0.0.1
- description: General, open-ended phrases for ordering/requesting.
- detail: An assortment of open-ended phrases for ordering which can be combined with specific entities for a variety of applications.
- examples: may I have, may I please get, may I try a, may I order one, can I try, can I please order, can I grab two, can I please have three, could I get, could I please have several, could I try that, could I please order this, I'll take, I'll take the, I'd like to have, I'd like to have several, I want, I want three, give me, give me one, please give me, please give me the
- category: commands
s.percent
Matches individual percentages using cardinal numbers.
- language: enUS
- version: 0.0.2
- description: Matches individual percentages using cardinal numbers.
- detail: Percents zero to one hundred (with 'percent' unit).
- examples: percent, one hundred percent, a hundred percent
- category: percent
s.phone-number
Matches individual phone numbers.
- language: enUS
- version: 0.0.1
- description: Matches individual phone numbers.
- detail: Common ways to say 10 digit phone numbers in the US; includes options for 'one' and 'nine' international dialing code, eight-hundred. Limits area codes to 200-900 (US). Interchangeable 'zero' and 'oh'.
- examples: one two three four five six seven eight nine oh, four one oh three three oh nine two nine two, one three zero nine three five five two zero two one
- category: phone-number
s.rooms
Matches an individual room name.
- language: enUS
- version: 0.0.2
- description: Matches an individual room name.
- detail: Individual names for types of rooms, for homes and businesses.
- examples: porch, living room, parlor, entry, entry way, entry room, den, breakfast nook, hallway, mud room, kitchen, bathroom, master bath, master bathroom, restroom, bedroom, guest room, play room, dining room, upstairs, downstairs, laundry room, entrance, basement, pantry, family room, foyer, den, sunroom, library, studio, nursery, office, home office, rec room, recreation room, attic, conference room, conference room, meeting room, reception, reception area, server room, break room, wellness room
- category: rooms
s.single-digit-integer
Matches individual numbers one through nine.
- language: enUS
- version: 0.0.2
- description: Matches individual numbers one through nine.
- detail: Does not include zero, as it may not apply to all use cases.
- examples: one, two, three, four, five, six, seven, eight, nine
- category: numbers
s.single-digit-ordinal
Matches individual ordinal numbers 'first' through 'ninth'.
- language: enUS
- version: 0.0.2
- description: Matches individual ordinal numbers 'first' through 'ninth'.
- detail: First through ninth, does not include 'zeroth'.
- examples: first, second, third, fourth, fifth, sixth, seventh, eighth, ninth
- category: numbers
s.special-character
Commonly occurring special characters.
- language: enUS
- version: 0.0.2
- description: Commonly occurring special characters.
- detail: Ways to speak common special characters like punctuation.
- examples: period, comma, stop, at sign, hashtag, pound sign, dollar sign, plus, curly bracket, right curly brace, open angle bracket, open paren, question mark, exclamation mark, apostrophe, pipe, colon, underscore, carat
- category: characters
s.switch
Open-ended commands for general navigation.
- language: enUS
- version: 0.0.1
- description: Open-ended commands for general navigation.
- detail: Common commands for switching from one item to another, in a menu, on a television, and more.
- examples: go to, switch to, watch, turn to, change to, put on, tune to
- category: commands
s.temperature.oven.celsius
Matches individual temperatures 100 C to 300 C for use with household ovens.
- language: enUS
- version: 0.0.2
- description: Matches individual temperatures 100 C to 300 C for use with household ovens.
- detail: Values from one hundred to three hundred, with optional 'degree' and/or 'celsius' appended.
- examples: , degrees, degrees celsius
- category: temperature
s.temperature.oven.fahrenheit
Matches individual temperatures 200 F to 500 F for use with household ovens.
- language: enUS
- version: 0.0.1
- description: Matches individual temperatures 200 F to 500 F for use with household ovens.
- detail: Values from two hundred to five hundred with optional "degrees" and "fahrenheit" appended.
- examples: , degrees, degrees fahrenheit
- category: temperature
s.temperature.thermostat.celsius
Matches individual temperatures 10 C to 40 C (thermostat).
- language: enUS
- version: 0.0.1
- description: Matches individual temperatures 10 C to 40 C (thermostat).
- detail: Values from ten to forty, with optional "degrees" and "celsius", for use with household thermostats.
- examples: , degrees, degrees celsius
- category: temperature
s.temperature.thermostat.fahrenheit
Matches individual temperatures 40 to 100 (thermostat).
- language: enUS
- version: 0.0.2
- description: Matches individual temperatures 40 to 100 (thermostat).
- detail: Values from forty to one hundred, with optional "degrees" and "fahrenheit", for use with household thermostats.
- examples: , degrees, fahrenheit, degrees fahrenheit
- category: temperature
s.time
Colloquial time/clock phrases in US English (no 'military' time).
- language: enUS
- version: 0.0.1
- description: Colloquial time/clock phrases in US English (no 'military' time).
- detail: 1 through 12 pm and am, half past/quarter till/ten to etc., o'clock, noon/midnight, afternoon/morning/evening/night matched to their common equivalent numeric hours (with some overlap in evening, night, and afternoon).
- examples: five thirteen, seven thirty a m, eight o'clock p m, twenty till eight, five past noon, ten to midnight, a quarter after four, quarter before nine, six thirteen in the morning, eight o'clock in the evening, a quarter past ten o'clock
- category: time
s.timer-phrases
Basic commands for setting a timer.
- language: enUS
- version: 0.0.1
- description: Basic commands for setting a timer.
- detail: Phrases for setting a timer, including specific durations (details in s.timer).
- examples: the timer, set a timer for , please start the timer for , start a timer, set timer, timer, how much time is left on my timer, wake me in
- category: command sets
s.timer
Durations for setting timers and alarms.
- language: enUS
- version: 0.0.1
- description: Durations for setting timers and alarms.
- detail: Seconds, minutes, hours and combinations for setting timers and alarms.
- examples: a sec, a second, one second, a minute, one minute, half hour, a half hour, one half hour, an hour, one hour, one hour and a half, an hour and a half, seconds, minutes, hours, minutes seconds, minutes and seconds, hours minutes, hours and minutes
- category: timer
General queries about how something is done, to be followed by verbs.
- language: enUS
- version: 0.0.1
- description: General queries about how something is done, to be followed by verbs.
- detail: An assortment of open-ended queries involving multiple interrogatives which can be combined with actions for a variety of applications.
- examples: how do I, show me how I, tell me how you, please show me how I can, tell me how I might, please show me how you, please help me to, how to, I want to know how I, how might I, help me, how can I, how to, I want to know how you, I want to know how you can, please tell me how you might
- category: commands
s.weight
Matches individual weights, in pounds and ounces.
- language: enUS
- version: 0.0.2
- description: Matches individual weights, in pounds and ounces.
- detail: Combinations of pounds and ounces, up to 100 pounds and 100 ounces. Allows for decimal pound amounts.
- examples: one pound, one ounce, pounds, one pound ounces, one pound and ounces, pounds ounces, pounds and ounces, point pounds, one pound one ounce, and a half pounds
- category: weight
s.when-queries
General queries about when something is, to be followed by nouns.
- language: enUS
- version: 0.0.1
- description: General queries about when something is, to be followed by nouns.
- detail: An assortment of open-ended queries involving multiple interrogatives which can be combined with entities for a variety of time applications. (Does not include phrases specific to setting a timer; see s.timer-phrases.grm for this application.)
- examples: what time is, at what hour is, what's the time of day of, what is the time of, please show me what hour is, please show when I might get, please show me when one might get, please tell me the hour of, tell me when you are going to, show me when is, I want to know when you can get, I want to know the time of, I want to know when is
- category: commands
s.yes-no
Common yes/no responses.
- language: enUS
- version: 0.0.2
- description: Common yes/no responses.
- detail: Common ways of saying yes or no. Does not include things like "nah I'm good", "right", "thanks", and "that's it".
- examples: yep, yup, yeah, yeah sure, sure, yes, yes please, please, okay, nope, no, no thanks, nah, no thank you
- category: yes-no
## STT _(STT only)_
See the [STT model type](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) for a description of
model behavior and settings.
You can [download](https://doc.sensory.com/tnl/7.8/models/downloads.md#models-downloads) additional models for other languages
in a range of different sizes.
### stt-enUS-automotive-medium-2.3.15-pnc.snsr _(STT only)_
STT recognizer with broad-domain support and special
focus on automotive command-and-control tasks. It includes a machine-learned
NLU component that identifies automotive intents and entities.
Results include capitalization and punctuation.
This model requires STT support, which currently depends on third-party
Open Source modules that are optionally included in the TrulyNatural SDK.
See [Open Source Licenses](https://doc.sensory.com/tnl/7.8/licenses/oss.md#open-source-licenses) for details.
All model components (acoustic, language, and NLU) are owned by Sensory.
### Details: NLU intents and entities
| Intent | Entities | Examples |
|:-------|:---------|:---------|
| activate_car_alarm | | car alarm on set the car alarm |
| activate_lights | automatic_lights, dashboard_lights, daytime_running_lights, door_lights, fog_lights, hazard_lights, high_beams, interior_lights, low_beams, parking_lights, light, front, rear, driver_side, left_side, passenger_side, right_side, duration, percentage_value, trunk | turn on the headlights flash the brights cabin lights on |
| adjust_mirror | front, rear, driver_side, left_side, passenger_side, right_side, side, rearview, direction, adjust_type, percentage_value, length_unit | adjust driver side mirror fold in the side mirrors lower rearview mirror set the passenger side mirror up one inch |
| affirm | | confirm please |
| answer_call | contact_name | accept my call answer call from ylana |
| average_m_p_g | | check gas mileage |
| battery_level | number_unit | check the battery level |
| bot_challenge | | am i talking to a human are you a bot |
| call_contact | contact_name, message | call jeff make a call to marco |
| call_emergency | ems | call nine one one call the police |
| call_end | | hang up call |
| call_general | | digit dial make a call |
| call_number | phone_number | call seven six five seven three zero six four one five |
| camera_off | camera, front, rear, rearview, side | back camera off |
| camera_on | camera, front, rear, rearview, side | activate all cameras turn on the dash cam |
| cancel | | cancel quit |
| change_gears | gear | put it in reverse shift into park |
| change_temp_unit | temperature_unit | change to celsius |
| change_time_zone | time_zone | change time zone to mountain standard time |
| check_messages | contact_name | check my messages how many messages in my inbox |
| climate_sync | hvac, left_side, right_side | turn on climate sync turn off synchronization of the a c |
| close_door | front, rear, driver_side, left_side, passenger_side, right_side, fuel_door, garage_door, side_door, van_door | back doors shut close driver's side door lower garage door |
| close_glove_box | glove_box | shut the cubby-hole close the jocky box raise the glove compartment |
| close_hood | hood, rear, trunk rear, hazard_lights | close my bonnet slam the hood |
| close_trunk | trunk | close the tailgate lower the trunk shut the boot |
| close_window | front, rear, driver_side, left_side, passenger_side, right_side, roof, percentage_value, adjust_type, number_unit, length_unit adjust_type, number_unit, length_unit | close the sunroof close the back passenger's side windows all the way front window up |
| connect_bluetooth | | bluetooth connect device car connect cell phone pair headphones |
| current_speed | speed_unit | display speed tell me what the speedometer says |
| d_v_d_player | adjust_type, percentage_value, front, rear | play a d v d for the kids |
| deactivate_car_alarm | | deactivate safety system silence the car alarm |
| deactivate_lights | automatic_lights, dashboard_lights, daytime_running_lights, door_lights, fog_lights, hazard_lights, high_beams, interior_lights, low_beams, parking_lights, light, front, rear, driver_side, left_side, passenger_side, right_side, duration, percentage_value, trunk | all lights off cabin light off deactivate adaptive driving beam |
| decline_call | contact_name | decline call from ylana dismiss incoming call |
| decrease_cruise_control | speed_unit, adjust_type, percentage_value, number_unit | decrease cruise control speed by eighty kilometers per hour decrease cruise forty lower the speed of the cruise |
| decrease_display | rear, front | dim the screen |
| decrease_fan_speed | front, rear, driver_side, left_side, passenger_side, right_side, hvac, adjust_type, percentage_value, duration, number_unit, safety_lock duration, number_unit, safety_lock | change fan to slower decrease fan some |
| decrease_lights | automatic_lights, dashboard_lights, daytime_running_lights, door_lights, fog_lights, hazard_lights, high_beams, interior_lights, low_beams, parking_lights, light, front, rear, driver_side, left_side, passenger_side, right_side, percentage_value, trunk | dim interior lights turn down the dashboard lights |
| decrease_seat_warmers | front, rear, driver_side, left_side, passenger_side, right_side, adjust_type, percentage_value | decrease my seat warmer make warmers cooler set all of my seat warmers down |
| decrease_steering_wheel_warmer | | |
| decrease_temperature | adjust_type, percentage_value, front, rear, driver_side, passenger_side, hvac, number_unit, temperature_unit | decrease heat decrease temperature by ten degrees celsius it's hot in here |
| decrease_volume | adjust_type, percentage_value, front, rear, number_unit front, rear, number_unit | crank audio level down audio softer |
| decrease_wiper_speed | front, rear, percentage_value, number_unit | decrease back windshield wiper speed lower the front wiper by a little bit |
| deny | | do not send no |
| feature_list | | what can i say what features do you have again |
| fuel_level | | check fuel level do i need to get gas soon how empty is the petrol tank |
| give_duration | duration | five minutes for two minutes |
| give_name | contact_name | jeff megan |
| give_time | time | ten a m three thirty p m |
| good_bye | | see you later |
| greet | | hello |
| honk_horn | | beep the horn honk |
| increase_cruise_control | speed_unit, adjust_type, percentage_value, number_unit | bump up the speed by three miles per hour increase cruise speed ten |
| increase_display | rear, front | make the screen brighter increase display brightness |
| increase_fan_speed | front, rear, driver_side, left_side, passenger_side, right_side, hvac, adjust_type, percentage_value, duration, number_unit, safety_lock | crank up the fans change fan speed to higher speed |
| increase_lights | automatic_lights, dashboard_lights, daytime_running_lights, door_lights, fog_lights, hazard_lights, high_beams, interior_lights, low_beams, parking_lights, light, front, rear, driver_side, left_side, passenger_side, right_side, percentage_value, trunk, driver_side, right_side | make the dome lights brighter |
| increase_seat_warmers | front, rear, driver_side, left_side, passenger_side, right_side, adjust_type, percentage_value | change seat warmer faster speed increase all my seat warmer |
| increase_steering_wheel_warmer | | |
| increase_temperature | adjust_type, percentage_value, front, rear, driver_side, passenger_side, hvac, number_unit, temperature_unit | crank up heater i'm extremely cold increase the a c by fifty percent make it warmer |
| increase_volume | adjust_type, percentage_value, front, rear, number_unit, artist | change the volume louder crank up the music level |
| increase_wiper_speed | front, rear, percentage_value, number_unit | all of my wipers higher |
| lock_door | front, rear, driver_side, left_side, passenger_side, right_side, safety_lock, van, fuel, hood, trunk, etc. | activate child safety locks doors lock up |
| lock_window | front, rear, driver_side, left_side, passenger_side, right_side | activate window child safety lock |
| lower_steering_wheel | adjust_type | move my steering wheel down |
| manual_default | | display vehicle diagnostics where is the driver's manual |
| manual_garage | | how do i program the garage remote |
| manual_set_memory | | can i save the passenger seat position save the driver seat memory two |
| manual_topic | help_feature | alarm help help cameras how do I adjust the headrest manual page for clock settings what is the recommended tire pressure for my car |
| music_player | player_action, album_title, artist, genre, podcast, song_title | play ,, find me some music by listen to , |
| navigation | navigation_location | how do I reach the nearest railway station best way to the closest park |
| no_command | *any* | *all unrecognized commands* |
| odometer_reset | | first trip counter to zero reset odometer |
| odometer_total | | current mileage how many miles on car |
| odometer_trip | | display trip distance what's the trip odometer |
| oil_level | number_unit | check oil level how much oil is left |
| open_door | front, rear, driver_side, left_side, passenger_side, right_side, fuel_door, garage_door, roof, side_door, van_door, percentage_value, roof | front right door open my sliding doors pop open open up all of the doors |
| open_glove_box | glove_box | lower the glove compartment open the cubby hole |
| open_hood | hood, hazard_lights | bonnet open please open the hood |
| open_trunk | trunk, rear | boot open my tailgate door pop open open the barn doors |
| open_window | front, rear, driver_side, left_side, passenger_side, right_side, roof, percentage_value, adjust_type, number_unit, length_unit adjust_type, number_unit, length_unit | roll down my windows crack the moonroof open lower windows by twenty five percent |
| query_airbag | driver_side, passenger_side, right_side, left_side | is the passenger airbag engaged |
| query_blinker | right_side, left_side, driver_side, passenger_side | are my blinkers on turn signal status |
| query_car_mode | car_mode | car mode status what's the car mode |
| query_cruise_control | speed_unit | what is the cruise set at what speed is the cruise on |
| query_date | date | what is today's date |
| query_defrost | front, rear | is front defrost on what is the status of the rear defrost |
| query_door_lock | front, rear, driver_side, left_side, passenger_side, right_side, roof, safety_lock, fuel_door, garage_door, van_door, trunk, hood, glove_box | are the back doors locked did i lock the car is the child-safety lock on |
| query_door_open | front, rear, driver_side, left_side, passenger_side, right_side, roof, safety_lock, fuel_door, garage_door, van_door, trunk, hood, glove_box | is the fuel door open is the hood closed did i remember to shut the garage door which doors are open |
| query_fan | fan_direction | is the fan set to footwell where is the air blowing |
| query_fan | front, rear | are the fans on what is the fan speed |
| query_lane_assist | | is the lane keeping aid active |
| query_light | automatic_lights, dashboard_lights, daytime_running_lights, door_lights, fog_lights, hazard_lights, high_beams, interior_lights, low_beams, parking_lights, light, front, rear, driver_side, left_side, passenger_side, right_side, duration, percentage_value, trunk | did i leave the low beams on is the dome light on what is the status of the daytime running lights |
| query_park_assist | | is the parking assistant on |
| query_parking_brake | | is the parking brake engaged |
| query_radio_station | | what channel is this |
| query_seat_warmers | front, rear, driver_side, left_side, passenger_side, right_side, percentage_value, adjust_type | are the heated seats on is the front right heated seat on |
| query_speed_limit | | what's the speed limit here |
| query_steer_assist | | is the steering assistant engaged |
| query_temperature | driver_side, passenger_side, hvac, percentage_value | check how cold is it inside check temp in the car tell me the current temperature of the car |
| query_temperature_outside | date | check exterior temperature today for me how hot is it outside my car |
| query_time | time_zone | what time is it in berlin what's the time now |
| query_timer | | how much left on the timer |
| query_volume | | how loud is the radio what is the volume |
| query_warning_light | | check dashboard lights describe dashboard warning tell me what is that light |
| query_weekday | date | what day is it today what's today |
| query_window | front, rear, driver_side, left_side, passenger_side, right_side, roof, percentage_value, adjust_type, number_unit, length_unit | is the back right window open are any windows open |
| query_wipers | front, rear | are the wipers on what is the speed of the wipers |
| query_year | | what year is it |
| raise_steering_whell | | move my steering wheel closer raise steering wheel up a bit |
| range | | distance until empty how far before i need to refuel |
| reset_demo | | reset the demo start over |
| scan_radio | | scan the fm radio search for radio stations |
| seat_backwards | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | adjust driver seat back move driver seat backwards |
| seat_cooling | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | activate seat cooling increase seat ventilation |
| seat_custom | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | adjust seat to preset one disable passenger seat suspension switch my driver's seat to saved position four |
| seat_forward | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | move driver seat forward |
| seat_incline | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | move front seat back up sit up straighter |
| seat_lower | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | lower my seat two inches |
| seat_raise | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | lift up my seat make my seat taller |
| seat_recline | front, rear, driver_side, left_side, passenger_side, right_side, center, length_unit, percentage_value | lay the seat back recline the passenger seatback |
| send_message | contact_name, message | message matt i'm on my way ask batty by text are you free for dinner compose note to john in gchat dm whitney |
| service_reminder | duration, time, date, number_unit, percentage_value, time_of_day | am i scheduled for an oil change soon begin charge at two a m schedule an appointment for a tune up |
| set_car_mode | car_mode | activate traction mode enable sports mode |
| set_cruise_control | speed_unit, adjust_type, percentage_value, number_unit | cruise control set to seventy eight set cruise control feature down to fifty five |
| set_display | rear, front | adjust the screen settings |
| set_fan | cabin_vent, dual_air, floor_vent, windshield_vent, driver_side, percentage_value | blow air on my feet direct air flow to the windshield |
| set_fan | front, rear, driver_side, left_side, passenger_side, right_side, hvac, adjust_type, percentage_value, duration, number_unit, safety_lock duration, number_unit, safety_lock | activate max a c adjust fan speed to low setting change the fan to four |
| set_lights | automatic_lights, dashboard_lights, daytime_running_lights, door_lights, fog_lights, hazard_lights, high_beams, interior_lights, low_beams, parking_lights, light, front, rear, percentage_value, driver_side, left_side, passenger_side, right_side, trunk | change cab light to max intensity |
| set_off_car_mode | car_mode | cancel hill start assist cut off auto pilot |
| set_radio | radio_station, genre | change station to w i p b listen to radio station a m ten seventy |
| set_seat_warmers | front, rear, driver_side, left_side, passenger_side, right_side, adjust_type, percentage_value | both seat warmers to three can you turn my seat heater on medium change warmers highest speed |
| set_steering_wheel_warmer | | |
| set_temperature | adjust_type, percentage_value, front, rear, driver_side, passenger_side, hvac, number_unit, temperature_unit | a c at seventy two degrees change front heater setting to low make the temperature sixty two degrees fahrenheit |
| set_time | | reset the clock |
| set_volume | adjust_type, artist, music_service, song_title | change audio volume zero mute music |
| set_volume | adjust_type, percentage_value, front, rear, number_unit, front, rear, number_unit | adjust the volume change audio to fifty percent make volume level three |
| set_wipers | front, rear, percentage_value, number_unit | all the wipers high my rear wiper highest |
| speed_limiter | | activate speed limiter decrease the governor increase limiter to forty five miles per hour |
| timer_on | duration | set a timer for ten minutes |
| timer_off | | cancel timer |
| tire_pressure | front, rear, driver_side, left_side, passenger_side, right_side | are my tires flat what is my front right tire pressure |
| turn_off_airbag | driver_side, passenger_side, left_side, right_side | activate passenger side airbag |
| turn_off_blinker | side | blinkers off deactivate left turn signal |
| turn_off_bluetooth | | bluetooth disconnect car device delete my bluetooth pairing for android |
| turn_off_car | | cut the engine disable vehicle |
| turn_off_cruise_control | | cancel cruise control |
| turn_off_defrost | front, rear, driver_side, left_side, passenger_side, right_side, percentage_value, adjust_type | all my back glass defroster off stop defroster |
| turn_off_display | rear, front | turn off the screens |
| turn_off_fan | front, rear, hvac, adjust_type, percentage_value | a c off can you turn off the heat cut fan |
| turn_off_lane_assist | | turn off lane assist deactivate l d w |
| turn_off_navigation | | quit g p s stop navigation |
| turn_off_park_assist | | turn off parking assistant |
| turn_off_parking_brake | roof | release e brake |
| turn_off_radio | | turn off radio |
| turn_off_seat_warmers | front, rear, driver_side, left_side, passenger_side, right_side | all seat warmers off disengage right seat's seat warmers |
| turn_off_steer_assist | | disengage steering assistant |
| turn_off_steering_wheel_warmer | | |
| turn_off_wipers | front, rear | back wipers off deactivate all wipers |
| turn_on_airbag | driver_side, passenger_side, left_side, right_side | |
| turn_on_blinker | side | left blinker on turn on the right blinker |
| turn_on_bluetooth | | engage bluetooth set up bluetooth |
| turn_on_car | | begin engine start the car |
| turn_on_cruise_control | | begin cruise control feature enable cruise control |
| turn_on_defrost | front, rear, driver_side, left_side, passenger_side, right_side, percentage_value, adjust_type | activate front defrost only demist rear windshield |
| turn_on_display | rear, front | turn on the heads up display can you turn on the rear tv screens turn on the kids screens |
| turn_on_fan | front, rear, hvac | activate a c initiate heater power on fan |
| turn_on_lane_assist | | activate l t a turn on lane keeping aid |
| turn_on_navigation | | begin navigation start the g p s |
| turn_on_park_assist | | park the car turn on parking sensors |
| turn_on_parking_brake | | activate parking brake engage hand brake |
| turn_on_radio | | turn on the radio |
| turn_on_seat_warmers | front, rear, driver_side, left_side, passenger_side, right_side | activate front left seat's seat heaters both seat heaters on |
| turn_on_steer_assist | | begin steer control |
| turn_on_steering_wheel_warmer | | |
| turn_on_wipers | front, rear | activate front wiper turn on rear wiper |
| unlock_door | front, rear, driver_side, left_side, passenger_side, right_side, roof, safety_lock, fuel_door, garage_door, van_door, trunk, hood, glove_box | all door child locks disable driver side door unlock |
| unlock_window | front, rear, driver_side, left_side, passenger_side, right_side | all my windows child lock off |
| warning_light._specific | warning_light | check engine light provide information regarding blinking check engine light on dashboard what does the green oil light indicate |
| wash_window | front, rear | clean all back glass mist my front windshield |
| washer_fluid_level | | check washer fluid level |
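
An application typically maps the intent names in the table above to its own actions. The following is a minimal sketch of such a lookup, assuming the intent arrives as a plain C string. How the SDK delivers NLU results is not covered here, and `ToyAction`, `ToyIntentMap`, `kIntentMap`, and `toyActionForIntent` are hypothetical application-side names, not part of the TrulyNatural API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical application-side actions; not part of the SDK. */
typedef enum {
  ACTION_NONE = 0,
  ACTION_WIPERS_ON,
  ACTION_WIPERS_OFF,
  ACTION_HONK
} ToyAction;

typedef struct {
  const char *intent;  /* intent name as listed in the table above */
  ToyAction action;    /* what this application does in response */
} ToyIntentMap;

/* Only the intents this (hypothetical) application cares about. */
static const ToyIntentMap kIntentMap[] = {
  { "turn_on_wipers",  ACTION_WIPERS_ON  },
  { "turn_off_wipers", ACTION_WIPERS_OFF },
  { "honk_horn",       ACTION_HONK       },
};

/* Returns the action for an intent name, or ACTION_NONE for names the
 * application does not handle (this includes "no_command"). */
ToyAction
toyActionForIntent(const char *intent)
{
  size_t i;
  if (!intent) return ACTION_NONE;
  for (i = 0; i < sizeof kIntentMap / sizeof kIntentMap[0]; i++) {
    if (strcmp(kIntentMap[i].intent, intent) == 0)
      return kIntentMap[i].action;
  }
  return ACTION_NONE;
}
```

A linear scan keeps the sketch short. With the full intent list, a sorted table with `bsearch()` or a generated perfect hash may be a better fit.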
## Templates
Templates add functionality to recognizer models. This includes running models
simultaneously or sequentially, and adding VAD audio gating.
See [template types](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) for an overview.
### tpl-spot-concurrent-1.5.0.snsr
Runs two [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models at the same time.
**Also see these related items:** [tpl-spot-concurrent](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-concurrent.md#tpl-spot-concurrent-type)
### tpl-spot-debug-1.5.1.snsr
Adds runtime data collection to a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) model.
**Also see these related items:** [tpl-spot-debug](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-debug.md#tpl-spot-debug-type)
### tpl-spot-select-1.4.0.snsr
Dynamically selects which of the two embedded [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type)
models to run.
**Also see these related items:** [tpl-spot-select](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-select.md#tpl-spot-select-type)
### tpl-spot-sequential-1.5.0.snsr
Runs two [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models one after the other, with optional
looping on the second. Includes push-to-talk as an alternative to the wake word.
**Also see these related items:** [tpl-spot-sequential](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-sequential.md#tpl-spot-sequential-type)
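
The sequencing behavior can be pictured as a two-state machine. This is a conceptual sketch only, not the template's implementation: the state and event names are hypothetical, and the timeout event that returns to the first wake word is an assumption added so the looping case has an exit.

```c
#include <stdbool.h>

/* Hypothetical states: which of the two wake word models is listening. */
typedef enum { TOY_WAIT_FIRST, TOY_WAIT_SECOND } ToyState;

/* Hypothetical events. TOY_EV_TIMEOUT is an assumption of this sketch. */
typedef enum {
  TOY_EV_FIRST_SPOTTED,
  TOY_EV_SECOND_SPOTTED,
  TOY_EV_PUSH_TO_TALK,
  TOY_EV_TIMEOUT
} ToyEvent;

/* Returns the next state for the given event.
 * loopSecond models the "optional looping on the second" model. */
ToyState
toySequentialStep(ToyState s, ToyEvent ev, bool loopSecond)
{
  switch (s) {
  case TOY_WAIT_FIRST:
    /* Push-to-talk stands in for spotting the first wake word. */
    if (ev == TOY_EV_FIRST_SPOTTED || ev == TOY_EV_PUSH_TO_TALK)
      return TOY_WAIT_SECOND;
    return TOY_WAIT_FIRST;
  case TOY_WAIT_SECOND:
    if (ev == TOY_EV_SECOND_SPOTTED)
      return loopSecond ? TOY_WAIT_SECOND : TOY_WAIT_FIRST;
    if (ev == TOY_EV_TIMEOUT)
      return TOY_WAIT_FIRST;
    return TOY_WAIT_SECOND;
  }
  return TOY_WAIT_FIRST;
}
```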
### tpl-spot-vad-3.13.0.snsr
Runs a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) until it spots, then does start-
and endpoint detection on the subsequent audio stream using
a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type).
**Also see these related items:** [tpl-spot-vad](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad.md#tpl-spot-vad-type)
### tpl-opt-spot-vad-lvcsr-1.28.0.snsr _(TrulyNatural only)_
Optionally runs a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) until it spots, segments the subsequent
audio stream with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type), then sends the segmented audio to
an [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer. You can select at runtime
whether recognition waits on the wake word or starts immediately.
**Also see these related items:** [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type)
### tpl-spot-vad-lvcsr-3.23.0.snsr _(TrulyNatural only)_
Runs a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) until it spots, segments the subsequent
audio stream with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type), then sends the segmented audio to
an [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer.
**Also see these related items:** [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type)
### tpl-vad-lvcsr-3.17.0.snsr
Detects speech with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) and sends the segmented audio to
an [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer.
**Also see these related items:** [tpl-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-vad-lvcsr.md#tpl-vad-lvcsr-type)
[THF Micro]: https://doc.sensory.com/thf-micro/ "THF Micro documentation"
*[API]: Application Programming Interface
*[EFT]: Enrolled Fixed Trigger: fixed wake words adapted to a speaker to improve accuracy
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[NLU]: Natural Language Understanding model
*[OSS]: Open-source software
*[PNC]: Punctuation and Capitalization, an STT model variant that emits cased text with punctuation
*[SDK]: Software Development Kit
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[UDT]: User-Defined Trigger: enrolled wake words and command sets
*[VAD]: Voice Activity Detector
---
source_path: "models/tpl/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/"
---
# Templates
Task templates are models that use composition to add behavior to
[basic model types](https://doc.sensory.com/tnl/7.8/models/types/index.md#model-types). Templates have _slots_ that you can fill with any
model that has a [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type) that matches what the slot expects.
The [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type) template, for example,
waits for the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type)
in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0), then runs the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) model in slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1).
The composed model has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type) `==` [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) and implements all of the
events and settings expected of such a model type. You can use it as a drop-in replacement
for a wake word in (say) [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spotc) without any code changes.
Compose new template-based models with [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit), on the fly with [snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval), or
by using the [setStream](https://doc.sensory.com/tnl/7.8/api/inference.md#setters) function at runtime.
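The slot rule can be pictured with a small sketch that does not use the SDK at all. It is a toy model under stated assumptions: the `Template` class and its `fill` and `is_complete` methods are hypothetical names invented for illustration, and the slot expectations are copied from the individual template pages.

```python
from dataclasses import dataclass, field

# Expected task type per slot, as listed on the individual template pages.
EXPECTED_SLOTS = {
    "tpl-opt-spot-vad-lvcsr": ["phrasespot", "lvcsr"],
    "tpl-spot-concurrent": ["phrasespot", "phrasespot"],
    "tpl-spot-debug": ["phrasespot"],
}


@dataclass
class Template:
    """Toy stand-in for a template model. Hypothetical, not an SDK class."""

    name: str
    task_type: str = "phrasespot"  # the three templates above all report phrasespot
    filled: dict = field(default_factory=dict)

    def fill(self, slot: int, model_name: str, model_task_type: str) -> None:
        """Put a model into a slot if its task type matches what the slot expects."""
        expected = EXPECTED_SLOTS[self.name]
        if not 0 <= slot < len(expected):
            raise IndexError(f"{self.name} has no slot {slot}")
        if model_task_type != expected[slot]:
            raise ValueError(
                f"slot {slot} of {self.name} expects {expected[slot]}, "
                f"got {model_task_type}"
            )
        self.filled[slot] = model_name

    def is_complete(self) -> bool:
        """True once every slot of the template holds a model."""
        return len(self.filled) == len(EXPECTED_SLOTS[self.name])
```

Because the composed model reports its own task type, a filled `tpl-spot-concurrent` could in turn be placed into any slot that expects `phrasespot`.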
## Composed models
[tpl-spot-concurrent](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-concurrent.md#tpl-spot-concurrent-type)
- Runs two [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models at the same time. It provides a convenient way to create a single
wake word model that has the combined vocabulary of two other models.
[tpl-spot-debug](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-debug.md#tpl-spot-debug-type)
- Adds runtime data collection to a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) model. Use this to collect audio and event timings
from an embedded model, [snsr-log-split](https://doc.sensory.com/tnl/7.8/tools/snsr-log-split.md#snsr-log-split) to extract audio, event logs, and the model itself from the generated
log file, and [audio-check](https://doc.sensory.com/tnl/7.8/tools/audio-check.md#audio-check) to verify audio recording quality.
[tpl-spot-select](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-select.md#tpl-spot-select-type)
- Allows you to dynamically select which of the two embedded [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models to run.
[tpl-spot-sequential](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-sequential.md#tpl-spot-sequential-type)
- Runs two [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models in sequence. Use this to listen for a trigger phrase followed by
a command, for example: "Voice genie, play music."
[tpl-spot-vad](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad.md#tpl-spot-vad-type)
- Runs the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) until it detects, then does start- and endpoint detection
with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) on the audio stream following the wake word.
[tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type) _(TrulyNatural only)_
- _Optionally_ runs the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) until it detects, then segments the audio
following the wake word with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) and sends the segmented audio to the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or
[STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer in slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1).
[tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type) _(TrulyNatural only)_
- Runs the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) until it detects, segments the audio following the wake
word with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type), and sends the segmented audio to the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type)
recognizer in slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1).
[tpl-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-vad-lvcsr.md#tpl-vad-lvcsr-type) _(TrulyNatural only)_
- Detects speech with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type) and sends the segmented audio to the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type)
or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) recognizer in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0).
*[API]: Application Programming Interface
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector
---
source_path: "models/tpl/tpl-opt-spot-vad-lvcsr.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr/"
---
# tpl-opt-spot-vad-lvcsr _(TrulyNatural only)_
This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) _optionally_ runs the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0)
until it detects, then segments the audio following the wake word with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type)
and sends the segmented audio to the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type)
recognizer in slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1).
[slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot) controls whether `tpl-opt-spot-vad-lvcsr` waits for the wake word:
* With `slot == 0` it waits for the wake word before starting the VAD.
In this mode the behavior is that of [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type).
* With `slot == 1` it starts the VAD immediately and the behavior is that of [tpl-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-vad-lvcsr.md#tpl-vad-lvcsr-type).
You can change [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot) at runtime. Use this to gate only the first of a series of commands
with a wake word.
`tpl-opt-spot-vad-lvcsr` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot).
Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type):
* **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
* **Slot 1:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr)
**Also see these related items:** [tpl-opt-spot-vad-lvcsr-1.28.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-opt-spot-vad-lvcsr), [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr.md#tpl-spot-vad-lvcsr-type), [tpl-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-vad-lvcsr.md#tpl-vad-lvcsr-type)
## Operation
```mermaid
flowchart TD
start((start))
slotCheck0{slot == 0?}
start --> slotCheck0
slotCheck0 -->|yes| startWW
slotCheck0 -->|no| fetch0
subgraph slot0[slot 0 (phrasespot)]
startWW((start))
fetchWW[/samples from ->audio-pcm/]
audioWW(^sample-count)
processWW[process]
result(0.^result)
stopWW((stop))
startWW --> fetchWW
fetchWW --> audioWW
audioWW --> processWW
processWW --> fetchWW
processWW -->|recognize| result
result --> stopWW
end
subgraph slot1[slot 1 (lvcsr)]
startSTT((start))
startSTTfinal((start))
stopSTT((stop))
stopSTTpartial((stop))
processSTT[process]
partialSTT(^result-partial)
intentSTT(^nlu-intent)
slotSTT(^nlu-slot)
resultSTT(^result)
nluSTT{NLU match?}
slmSTT{SLM included?}
generateSTT[generate]
slmstartSTT(^slm-start)
slmresultpartialSTT(^slm-result-partial)
slmresultSTT(^slm-result)
startSTT --> processSTT
processSTT ---->|hypothesis| partialSTT
partialSTT --> stopSTTpartial
startSTTfinal --> nluSTT
nluSTT -->|yes| intentSTT
nluSTT -->|no| resultSTT
intentSTT --> slotSTT
slotSTT --> resultSTT
slotSTT -->|more| intentSTT
resultSTT --> slmSTT
slmSTT -->|yes| slmstartSTT
slmSTT -->|no| stopSTT
slmstartSTT -->|OK| generateSTT
slmstartSTT -->|STOP| stopSTT
generateSTT -->|response| slmresultpartialSTT
slmresultpartialSTT --> generateSTT
generateSTT -->|done| slmresultSTT
slmresultSTT --> stopSTT
end
listenBegin(^listen-begin)
listenEnd(^listen-end)
stopWW --> listenBegin
listenBegin --> fetch0
fetch0[/samples from ->audio-pcm/]
fetch1[/samples from ->audio-pcm/]
audio0(^sample-count)
audio1(^sample-count)
silence(^silence)
begin(^begin)
END(^end)
limit(^limit)
process0[VAD process]
process1[VAD process]
final@{ shape: f-circ }
slotCheck1{slot == 0?}
fetch0 --> audio0
audio0 --> process0
process0 --> fetch0
process0 -->|speech start| begin
process0 -->|timeout| silence
silence ~~~ final
silence --> slotCheck1
begin --> fetch1
fetch1 --> audio1
audio1 --> process1
process1 --> startSTT
stopSTTpartial --> fetch1
process1 -->|speech end| END
process1 -->|speech limit| limit
END --> final
limit --> final
final --> startSTTfinal
stopSTT --> slotCheck1
slotCheck1 -->|no| fetch0
slotCheck1 -->|yes| listenEnd
listenEnd --> startWW
```
Operation flow.
1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. If processing does not detect a wake word, continue at step 1.
4. Invoke [0.^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) for the wake word.
5. Invoke [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin) and start VAD processing.
6. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
7. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
8. If VAD processing does not detect the start of speech within the [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) timeout, invoke [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) and continue at step 15.
9. Invoke [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin) if processing detects the start of speech, else continue at step 6.
10. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
11. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
12. If VAD processing detects an endpoint, invoke either [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit) or [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end) and continue at step 14.
13. Process VAD-segmented audio in the LVCSR or STT recognizer.
    * Invoke [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) with an interim recognition hypothesis.
    * Continue at step 10.
14. Produce a final LVCSR or STT recognition hypothesis.
    * Invoke [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) for each NLU intent found.
    * Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) with the final recognition hypothesis.
    * If there's no SLM, continue at step 15.
    * Invoke [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start). If the callback returns [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop), continue at step 15.
    * Generate the SLM result, invoking [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial) on each generated token.
    * Invoke [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result) with the complete SLM result.
15. Invoke [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end) and start listening for the wake word again at step 1.
Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in.
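To make the event ordering easier to check against your own handlers, here is a minimal sketch that lists the events for one pass through the flow. It is an SDK-independent model, not an API: `expected_events` and all of its parameters are hypothetical. It assumes, reading the flowchart, that `slot == 1` skips the wake word along with `^listen-begin` and `^listen-end`, and it follows the step list literally by emitting one `^nlu-slot` per intent.

```python
def expected_events(slot=0, speech_detected=True, endpoint="^end",
                    partial_results=1, nlu_intents=0,
                    slm_tokens=None, slm_start_ok=True):
    """Return the event names, in order, for one pass through the flow.

    ^sample-count fires after every audio read and is left out for brevity.
    slm_tokens=None models a recognizer without an SLM.
    """
    if endpoint not in ("^end", "^limit"):
        raise ValueError("endpoint must be '^end' or '^limit'")

    events = []
    gated = slot == 0
    if gated:
        events += ["0.^result", "^listen-begin"]          # steps 4 and 5
    if not speech_detected:
        events.append("^silence")                         # step 8
    else:
        events.append("^begin")                           # step 9
        events += ["^result-partial"] * partial_results   # step 13
        events.append(endpoint)                           # step 12
        for _ in range(nlu_intents):                      # step 14
            events += ["^nlu-intent", "^nlu-slot"]
        events.append("^result")
        if slm_tokens is not None:
            events.append("^slm-start")
            if slm_start_ok:  # the handler did not return STOP
                events += ["^slm-result-partial"] * slm_tokens
                events.append("^slm-result")
    if gated:
        events.append("^listen-end")                      # step 15
    return events
```

Calling `expected_events(slot=0, speech_detected=False)` models a wake word followed by a leading-silence timeout, and `expected_events(slot=1, nlu_intents=1)` models the VAD-only path with one NLU intent.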
## Settings
**Available events:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin), [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end), [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event), [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence), [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result), [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial), [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start)
**Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator)
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to)
**Available configuration settings:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [backlog-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backlog-interval), [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [custom-vocab](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#custom-vocab), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence), [include-wake-word-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-wake-word-audio), [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot), [stt-profile](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#stt-profile), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold), [wake-word-at-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#wake-word-at-end)
**Available values:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr), [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code)
## Notes
Use this template for command and control type applications where commands are
initiated with a wake word in certain contexts and not in others.
Set [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot)`= 1` in the [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) handler, and [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot)`= 0` in the
[^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) handler. With this configuration the recognizer requires a wake word to start
listening only for the first in a series of interactions. After this it will revert to requiring
a wake word only if the user does not say anything for at least [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) ms.
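The gating behavior described above amounts to a two-state toggle. The following sketch simulates it without the SDK. `WakeWordGate` and `run` are hypothetical names, and in a real application the two handlers would change the `slot` setting on the session rather than a Python attribute.

```python
class WakeWordGate:
    """Toy model of the slot setting. Hypothetical, not an SDK class."""

    def __init__(self):
        self.slot = 0  # 0: wait for the wake word; 1: start the VAD immediately

    def on_result(self):
        # Handler for the slot 1 ^result event (not 0.^result): a command was
        # recognized, so the next interaction does not need a wake word.
        self.slot = 1

    def on_silence(self):
        # Handler for ^silence: nobody spoke within leading-silence,
        # so require the wake word again.
        self.slot = 0


def run(outcomes):
    """Replay a series of interactions.

    outcomes: iterable of "command" or "silence", one per interaction.
    Returns a list of booleans: whether each interaction needed a wake word.
    """
    gate = WakeWordGate()
    needed_wake_word = []
    for outcome in outcomes:
        needed_wake_word.append(gate.slot == 0)
        if outcome == "command":
            gate.on_result()
        elif outcome == "silence":
            gate.on_silence()
        else:
            raise ValueError(f"unknown outcome: {outcome!r}")
    return needed_wake_word
```

For the series `["command", "command", "silence", "command"]` only the first and last interactions are gated by the wake word: the first because `slot` starts at 0, the last because the preceding silence timeout reset it.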
VAD settings [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording), and [trailing-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#trailing-silence)
apply to both slot 0 and slot 1, but [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence) applies only to slot 0.
Set [include-wake-word-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-wake-word-audio)` = 1` to include the wake word audio in the
samples passed to the LVCSR or STT recognizer. STT hypotheses do not include the wake word
text unless Sensory specifically configured the model to do so.
The [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) and [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) events are for the LVCSR or STT recognizer
in slot 1. If you need direct access to the wake word result, prefix the event
with the slot path: `0.^result`. Use the slot prefix to read values in the `0.^result` event handler too. For example, call [getString](https://doc.sensory.com/tnl/7.8/api/inference.md#getters) with key [0.text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) to read the wake word transcription.
## Examples
### Select wake-word or VAD-only behavior
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% bin/snsr-edit -o opt-vg-stt.snsr \
  -t model/tpl-opt-spot-vad-lvcsr-1.28.0.snsr \
  -f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr \
  -f 1 model/stt-enUS-automotive-medium-2.3.15-pnc.snsr \
  -s include-wake-word-audio=1
# Say "Voice genie, open the sunroof."
% snsr-eval -vt opt-vg-stt.snsr
Using live audio from default capture device. ^C to stop.
P 33010 33490 (0.3201) Open the sun
P 33050 33890 (0.7712) Open the sunroof
32010 34185 [^end] VAD speech region.
NLU intent: open_window (0.9956) = open the sunroof
NLU entity: roof (0.9595) = sunroof
33050 33890 (0.5731) Open the sunroof.
^C
# Select the VAD-only path with slot=1
# Say "Close all the windows"
% snsr-eval -vt opt-vg-stt.snsr -s slot=1
Using live audio from default capture device. ^C to stop.
P 2150 2670 (0.257) Clothes. All
P 2190 3150 (0.7631) Close. All the wind
P 2190 3430 (0.9899) Close all the windows
1950 3855 [^end] VAD speech region.
NLU intent: close_window (0.9977) = close all the windows
2190 3470 (0.9244) Close all the windows.
^C
```
### Use trailing wake-word
Recognize a phrase with the wake word at either end of an utterance.
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% bin/snsr-edit -o opt-vg-stt-vg.snsr \
  -t model/tpl-opt-spot-vad-lvcsr-1.28.0.snsr \
  -f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr \
  -f 1 model/stt-enUS-automotive-medium-2.3.15-pnc.snsr \
  -s include-wake-word-audio=1 \
  -s wake-word-at-end=1
# Say "Voice genie, set the radio to 91.5 FM."
% bin/snsr-eval -vt opt-vg-stt-vg.snsr
Using live audio from default capture device. ^C to stop.
P 4360 5000 (0.2927) Set. The radio
P 4400 5280 (5.7e-07) Set the radio to n
P 4400 5760 (0.7336) Set the radio to ninety-one
P 4400 6120 (0.6005) Set the radio to ninety one point
P 4400 6440 (0.5195) Set the radio to ninety one point. Five
P 4400 6480 (0.6733) Set the radio to ninety one point. Five
3405 7455 [^end] VAD speech region.
NLU intent: set_radio (0.9674) = set the radio to 91.5 FM
NLU entity: radio_station (0.9688) = 91.5 FM
4400 7080 (0.3896) Set the radio to ninety one point. Five F. M.
# Say "Will it rain in Portland tomorrow, Voice Genie?"
15225 17490 [^end] VAD speech region.
NLU intent: no_command (0.9977) = will it rain in portland tomorrow
NLU entity: time (0.9773) = tomorrow
15460 17260 (0.6731) Will it rain in Portland tomorrow?
^C
```
*[API]: Application Programming Interface
*[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[NLU]: Natural Language Understanding model
*[SLM]: Generative Small Language Model
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector
---
source_path: "models/tpl/tpl-spot-concurrent.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-concurrent/"
---
# tpl-spot-concurrent
This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) runs two [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models at
the same time. It provides a convenient way to create a single wake word model
that has the combined vocabulary of two other models.
`tpl-spot-concurrent` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot).
Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type):
* **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
* **Slot 1:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [tpl-spot-concurrent-1.5.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-concurrent)
## Operation
```mermaid
flowchart TD
start((start))
fetch[/samples from ->audio-pcm/]
audio(^sample-count)
split@{ shape: f-circ }
join@{ shape: f-circ }
start --> fetch
fetch --> audio
audio --> split
split --> start0
split --> start1
end0 --> join
end1 --> join
join ----> fetch
subgraph slot0[slot 0 (phrasespot)]
start0((start))
process0[process]
result0(^result)
end0((stop))
start0 --> process0
process0 --> end0
process0 -->|recognize| result0
result0 --> end0
end
subgraph slot1[slot 1 (phrasespot)]
start1((start))
process1[process]
result1(^result)
end1((stop))
start1 --> process1
process1 --> end1
process1 -->|recognize| result1
result1 --> end1
end
```
Operation flow.
1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. Send audio samples to recognizers in slot 0 and slot 1.
4. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase.
5. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm),
or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok).
Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in.
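The fan-out in steps 3 to 5 can be sketched as a single-threaded loop. This is an illustration only: `run_concurrent` and `keyword_spotter` are hypothetical, text strings stand in for audio chunks, and a handler returning `False` stands in for an event handler returning a code other than OK.

```python
def keyword_spotter(keyword):
    """Toy recognizer: 'detects' when the chunk equals the keyword."""
    def recognize(chunk):
        return keyword if chunk == keyword else None
    return recognize


def run_concurrent(chunks, recognizers,
                   on_result=lambda slot, index, phrase: True):
    """Send every chunk to every recognizer, in slot order, on one thread.

    Returns (slot, chunk_index, phrase) tuples in the order they were found.
    Processing stops early when on_result returns False.
    """
    results = []
    for index, chunk in enumerate(chunks):
        for slot, recognize in enumerate(recognizers):
            phrase = recognize(chunk)
            if phrase is None:
                continue
            results.append((slot, index, phrase))
            if not on_result(slot, index, phrase):
                return results
    return results
```

Each recognizer sees the same chunk independently of the other, so both can report a result for the same stretch of audio.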
## Settings
**Available events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event)
**Available iterators:** _none_
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to)
**Available configuration settings:** [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0), [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1), [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second)
**Available values:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code)
## Notes
Runs the wake word models in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) and slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1) at the same
time, in the same thread.
The two recognizers are entirely independent, and can produce
results that overlap in time. For production use, Sensory recommends
custom multi-phrase wake word recognizers instead. These have improved
false reject / false accept performance.
The combined model is a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type), and can be used
in any application that expects such a model without API changes.
Configuration settings and iterators are not available in the
combined model. You can access these for the individual models
by prefixing the setting path with the slot. For example,
use `0.operating-point` to read or change the [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) of the first spotter
and use `1.operating-point` to read or change the [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) of the second spotter.
Attempting to set [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) without a slot prefix will result in an error.
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% bin/snsr-edit -o vg-hbg.snsr \
  -t model/tpl-spot-concurrent-1.5.0.snsr \
  -f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr \
  -f 1 model/spot-hbg-enUS-1.4.0-m.snsr
% bin/snsr-edit \
-t vg-hbg.snsr \
-s operating-point=5
Setting "operating-point" not found, did you mean "0.operating-point" or "1.operating-point"?
```
Change individual settings at runtime by prefixing the
setting name with the slot:
**C/C++**
```c
/* Set the operating point for spotter 0 only. */
snsrSetInt(session, SNSR_SLOT_0 SNSR_OPERATING_POINT, 7);
```
**Java**
```java
/* Set the operating point for spotter 0 only. */
session.setInt(Snsr.SLOT_0 + Snsr.OPERATING_POINT, 7);
```
**Python**
```python
# Set the operating point for spotter 0 only.
session.set_int(snsr.SLOT_0 + snsr.OPERATING_POINT, 7)
```
You can recombine combined models and nest them to an arbitrary depth to
run any number[^1] of wake word recognizers at the same time:
[^1]: Limited only by available RAM and CPU.
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% bin/snsr-edit -o models1-4.snsr \
  -t model/tpl-spot-concurrent-1.5.0.snsr \
  -f 0 model/tpl-spot-concurrent-1.5.0.snsr \
  -f 0.0 model-1.snsr \
  -f 0.1 model-2.snsr \
  -f 1 model/tpl-spot-concurrent-1.5.0.snsr \
  -f 1.0 model-3.snsr \
  -f 1.1 model-4.snsr
```
In this example, the four wake word models are located in the `0.0`, `0.1`,
`1.0`, and `1.1` slots.
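Nested slot paths are plain dotted strings, so a small helper can build them. The sketch below is hypothetical and SDK-independent. It assumes that a setting inside a nested slot is addressed by chaining the prefixes, for example `0.1.operating-point`, which extends the single-level `0.operating-point` form shown earlier and is not confirmed here for deeper nesting.

```python
def slot_key(slots, setting=""):
    """Build a dotted key such as "0.1.operating-point".

    slots: slot indices from the outermost template inward, e.g. [0, 1].
    setting: setting name inside that slot; empty to name the slot itself.
    """
    indices = [int(s) for s in slots]
    if any(i < 0 for i in indices):
        raise ValueError("slot indices must be non-negative")
    prefix = ".".join(str(i) for i in indices)
    if not prefix:
        return setting
    return f"{prefix}.{setting}" if setting else prefix
```

With this helper, `slot_key([1, 0])` names the third model in the nested example, and `slot_key([0], "operating-point")` reproduces the single-level key used for the first spotter.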
## Examples
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% bin/snsr-edit -o vg-hbg.snsr \
  -t model/tpl-spot-concurrent-1.5.0.snsr \
  -f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr \
  -f 1 model/spot-hbg-enUS-1.4.0-m.snsr
% bin/snsr-eval -t vg-hbg.snsr
2370 2940 voicegenie
5805 6420 voicegenie
7740 8640 hello blue genie
10440 11100 voicegenie
12060 12870 hello blue genie
^C
```
*[API]: Application Programming Interface
*[RAM]: Random Access Memory
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "models/tpl/tpl-spot-debug.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-debug/"
---
# tpl-spot-debug
This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) adds runtime data collection to a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type)
model. Use this to collect audio and event timings from an embedded model,
[snsr-log-split](https://doc.sensory.com/tnl/7.8/tools/snsr-log-split.md#snsr-log-split) to extract audio, event logs, and the model itself from the generated
log file, and [audio-check](https://doc.sensory.com/tnl/7.8/tools/audio-check.md#audio-check) to verify audio recording quality.
`tpl-spot-debug` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot).
Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type):
* **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [tpl-spot-debug-1.5.1.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-debug)
## Operation
```mermaid
flowchart TD
start0((start))
log@{ shape: doc, label: "debug-log-file" }
start0 --> start
slot0 -.-> log
subgraph slot0[slot 0 (phrasespot)]
start((start))
fetch[/samples from ->audio-pcm/]
audio(^sample-count)
process
result(^result)
start --> fetch
fetch --> audio
audio --> process
process --> fetch
process -->|recognize| result
result --> fetch
end
```
Operation flow.
1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase.
4. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm),
or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok).
Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in.
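The operation flow above can be modeled in a few lines of plain Python. This is a minimal sketch of the control flow only, not SDK code: `run_spotter`, the `detect` stub, and the string return codes are hypothetical stand-ins chosen for illustration.

```python
OK, STOP, STREAM_END = "OK", "STOP", "STREAM_END"  # stand-ins for SDK return codes


def run_spotter(chunks, detect, on_sample_count=None, on_result=None):
    """Model the loop: read audio, ^sample-count, optional ^result, repeat."""
    samples = 0
    for chunk in chunks:                         # step 1: read from ->audio-pcm
        samples += len(chunk)
        if on_sample_count is not None:          # step 2: ^sample-count
            rc = on_sample_count(samples)
            if rc != OK:
                return rc                        # step 4: a handler ended the run
        phrase = detect(chunk)                   # hypothetical recognizer stub
        if phrase is not None and on_result is not None:
            rc = on_result(phrase)               # step 3: ^result
            if rc != OK:
                return rc
    return STREAM_END                            # step 4: audio source exhausted
```

Handlers are optional in the sketch, mirroring the advice to register callbacks only for the events you need.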
## Settings
**Available events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event)
**Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator)
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream), [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream)
**Available configuration settings:** [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0), [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [debug-log-file](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#debug-log-file), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [include-model](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-model), [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold)
**Available values:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code)
## Notes
You must specify the name of the log file with [debug-log-file](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#debug-log-file).
[include-model](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-model) controls whether the log file includes a copy
of the original task model. This is enabled by default.
The combined model is a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) that you can use as a
drop-in replacement for the original wake word without any API changes.
Log files include time-stamped entries with:
* SDK library information,
* the spotter model being used,
* audio samples, and
* event callbacks.
Extract text, model and audio data from the log file with the [snsr-log-split](https://doc.sensory.com/tnl/7.8/tools/snsr-log-split.md#snsr-log-split) utility.
## Examples
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% bin/snsr-edit -o hbg-debug.snsr\
-t model/tpl-spot-debug-1.5.1.snsr\
-f 0 model/spot-hbg-enUS-1.4.0-m.snsr\
-s debug-log-file=hbg-debug.snsrlog
% bin/snsr-eval -t hbg-debug.snsr
2925 3690 hello blue genie
4995 5790 hello blue genie
7920 8640 hello blue genie
^C
# The error below is harmless and expected when you
# interrupt snsr-eval with ^C
% bin/snsr-log-split -vv hbg-debug.snsrlog
Writing to './'
Processing hbg-debug.snsrlog
-> audio ./hbg-debug.wav
-> event ./hbg-debug.txt
-> model ./hbg-debug.snsr
Error: Input file "hbg-debug.snsrlog" is truncated.
Processed 1273 items.
```
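If you capture the `snsr-eval` output shown above, a short script can turn each detection line into structured data. This sketch assumes that the first two columns are the start and end times of the spotted phrase in milliseconds; `Spot` and `parse_spots` are hypothetical helper names, not part of the SDK.

```python
import re
from typing import List, NamedTuple


class Spot(NamedTuple):
    begin_ms: int   # assumed: start of the phrase, in milliseconds
    end_ms: int     # assumed: end of the phrase, in milliseconds
    phrase: str


# Matches lines such as "  2925   3690 hello blue genie".
_SPOT_LINE = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\S.*?)\s*$")


def parse_spots(text: str) -> List[Spot]:
    """Return one Spot per detection line; all other lines are ignored."""
    spots = []
    for line in text.splitlines():
        match = _SPOT_LINE.match(line)
        if match:
            spots.append(Spot(int(match.group(1)), int(match.group(2)), match.group(3)))
    return spots
```

Shell prompts, comments, and the `^C` marker do not start with two integers, so the parser skips them.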
*[API]: Application Programming Interface
*[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "models/tpl/tpl-spot-select.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-select/"
---
# tpl-spot-select
This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) allows you to dynamically select which of the
two embedded [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models to run.
`tpl-spot-select` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot).
Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type):
* **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
* **Slot 1:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [tpl-spot-select-1.4.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-select)
## Operation
```mermaid
flowchart TD
start((start))
fetch[/samples from ->audio-pcm/]
audio(^sample-count)
join@{ shape: f-circ }
start --> fetch
fetch --> audio
audio -->|slot == 0| start0
audio -->|slot == 1| start1
end0 --> join
end1 --> join
join ----> fetch
subgraph slot0[slot 0 (phrasespot)]
start0((start))
process0[process]
result0(^result)
end0((stop))
start0 --> process0
process0 --> end0
process0 -->|recognize| result0
result0 --> end0
end
subgraph slot1[slot 1 (phrasespot)]
start1((start))
process1[process]
result1(^result)
end1((stop))
start1 --> process1
process1 --> end1
process1 -->|recognize| result1
result1 --> end1
end
```
Operation flow.
1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. Send audio to the recognizer specified by [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot).
4. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase.
5. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm),
or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok).
Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in.
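The routing described above can be pictured with a small model. This is an illustrative sketch rather than SDK code: audio chunks are represented by the text that was spoken, the two spotters are hypothetical callables, and `current_slot` is read before every chunk to mimic changing the slot setting at runtime.

```python
from typing import Callable, List, Optional, Sequence

Spotter = Callable[[str], Optional[str]]


def run_select(chunks: Sequence[str], spotters: Sequence[Spotter],
               current_slot: Callable[[], int]) -> List[str]:
    """Send each chunk to the selected spotter only and collect its results."""
    results = []
    for chunk in chunks:                      # read audio
        active = spotters[current_slot()]     # route to the slot chosen right now
        phrase = active(chunk)
        if phrase is not None:
            results.append(phrase)            # corresponds to ^result
    return results
```

The spotter in the unselected slot never sees audio, so it cannot report a result.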
## Settings
**Available events:** [^adapt-started](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapt-started), [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^new-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#new-user), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event)
**Available iterators:** _none_
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [delete-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#delete-user), [rename-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#rename-user)
**Available configuration settings:** [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0), [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1), [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot)
**Available values:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code)
## Notes
Use [slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#slot) to select either the spotter in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0) or slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1).
Use this template when an application uses variants
of the same recognizer in different contexts. Identical objects are shared
between the slots, which reduces the overall model size and RAM requirements.
The combined model is a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type), and can be used
in any application that expects such a model without API changes.
Configuration settings and iterators are not available in the
combined model. You can access these for the individual models
by prefixing the setting path with the slot. For example,
use `1.operating-point` to read or change the [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point)
of the second spotter.
Change individual settings at runtime by prefixing the
setting name with the slot:
**C/C++**
```c
/* Set the operating point for spotter 0 only. */
snsrSetInt(session, SNSR_SLOT_0 SNSR_OPERATING_POINT, 7);
```
**Java**
```java
/* Set the operating point for spotter 0 only. */
session.setInt(Snsr.SLOT_0 + Snsr.OPERATING_POINT, 7);
```
**Python**
```python
# Set the operating point for spotter 0 only.
session.set_int(snsr.SLOT_0 + snsr.OPERATING_POINT, 7)
```
## Examples
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% bin/snsr-edit -o vg-hbg-select.snsr\
-t model/tpl-spot-select-1.4.0.snsr\
-f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr\
-f 1 model/spot-hbg-enUS-1.4.0-m.snsr
# repeat "hello blue genie" and "voice genie"
% bin/snsr-eval -t vg-hbg-select.snsr -s slot=0
3480 4140 voicegenie
9945 10545 voicegenie
^C
# repeat "hello blue genie" and "voice genie"
% bin/snsr-eval -t vg-hbg-select.snsr -s slot=1
1635 2460 hello blue genie
6210 6870 hello blue genie
^C
```
*[API]: Application Programming Interface
*[RAM]: Random Access Memory
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "models/tpl/tpl-spot-sequential.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-sequential/"
---
# tpl-spot-sequential
This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) runs two [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models
in sequence. Use this to listen for a trigger phrase followed by a command,
for example: "Voice genie, play music."
`tpl-spot-sequential` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot).
Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type):
* **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
* **Slot 1:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [tpl-spot-sequential-1.5.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-concurrent)
## Operation
```mermaid
flowchart TD
start((start))
loop0{loop == 2?}
start --> loop0
loop0 -->|no| start0
loop0 -->|yes| start1
subgraph slot0[slot 0 (phrasespot)]
start0((start))
fetch0[/samples from ->audio-pcm/]
audio0(^sample-count)
process0[process]
stop0((stop))
start0 --> fetch0
fetch0 --> audio0
audio0 --> process0
process0 --> fetch0
process0 -->|recognize| stop0
end
listenBegin(^listen-begin)
stop0 --> listenBegin
listenBegin --> start1
subgraph slot1[slot 1 (phrasespot)]
start1((start))
fetch1[/samples from ->audio-pcm/]
audio1(^sample-count)
process1[process]
result1(^result)
stop1((stop))
loop{loop == 0?}
loop2{loop == 2?}
start1 --> fetch1
fetch1 --> audio1
audio1 --> process1
process1 --> fetch1
process1 --->|recognize| result1
process1 -->|timeout| loop2
loop2 -->|no| stop1
loop2 -->|yes| fetch1
result1 --> loop
loop -->|no| fetch1
loop -->|yes| stop1
end
listenEnd(^listen-end)
stop1 --> listenEnd
listenEnd --> start0
```
Operation flow.
1. If [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `== 2`, skip to step 6.
2. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
3. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
4. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase, else continue at step 2.
5. Invoke [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin), then start the wake word in slot 1.
6. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
7. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
8. If [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `!= 2` and processing does not detect a wake word within
[listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window), invoke [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end) and restart the slot 0 wake word at step 2.
9. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase, else continue at step 6.
10. If [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `== 0`, invoke [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end) and continue at step 2.
11. If [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `!= 0`, reset the listen-window timeout and continue processing
at step 6.
12. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm),
or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok).
Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in.
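The interaction between the two slots and the `loop` setting can be easier to follow as a small state machine. The sketch below is a model under stated assumptions, not the SDK implementation: each frame is represented by the text spoken in it, `slot0` and `slot1` are hypothetical callables, and `listen_window` is counted in frames for simplicity. It follows the flowchart, where a slot 0 detection leads straight to `^listen-begin`, and it omits `^sample-count`.

```python
from typing import Callable, List, Optional, Sequence

Spotter = Callable[[str], Optional[str]]


def run_sequential(frames: Sequence[str], slot0: Spotter, slot1: Spotter,
                   loop: int = 0, listen_window: int = 3) -> List[str]:
    """Return the event names a run over `frames` would produce in this model."""
    events = []
    in_slot1 = loop == 2      # step 1: loop == 2 starts directly in slot 1
    idle = 0                  # frames without a slot 1 detection since the last reset
    for frame in frames:
        if not in_slot1:
            if slot0(frame) is not None:           # steps 2-4: slot 0 spotted
                events.append("^listen-begin")     # step 5
                in_slot1, idle = True, 0
            continue
        phrase = slot1(frame)                      # steps 6-7
        if phrase is not None:
            events.append("^result " + phrase)     # step 9
            if loop == 0:                          # step 10: back to slot 0
                events.append("^listen-end")
                in_slot1 = False
            else:                                  # step 11: reset the timeout
                idle = 0
            continue
        idle += 1
        if loop != 2 and idle >= listen_window:    # step 8: listen-window expired
            events.append("^listen-end")
            in_slot1 = False
    return events
```

With `loop=2` the model never leaves slot 1, which matches the push-to-talk use case where the wake word is bypassed.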
## Settings
**Available events:** [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin), [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event)
**Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator)
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream), [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream)
**Available configuration settings:** [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0), [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1), [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window), [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold)
**Available values:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code)
## Notes
With [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `== 0` (the default), this template
runs the spotter in slot `0` until it spots, then runs slot `1`
until it spots or the [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window) timeout expires, then
returns to the spotter in slot `0`.
With [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `== 1`, the template
runs the spotter in slot `0` until it spots, then runs slot `1`
until the [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window) timeout expires, then returns to
the spotter in slot `0`. It resets the expiration timer every time
slot `1` recognizes.
With [loop](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#loop) `== 2`, the template runs only slot `1`.
If your application needs to listen for a wake word but also support
an external trigger, such as a push-to-talk button, set `loop=2`
when the event occurs.
The combined model is a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) and can be used
in any application that expects such a model without code changes.
Combined model settings refer to the model in slot `1`,
so [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point) refers to `1.operating-point`.
You can change settings for the wake word in slot `0`
by prefixing the setting name with [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0), for example:
`0.operating-point`.
The model invokes [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin) just before audio focus switches
to slot 1, and [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end) before audio focus switches back to
slot 0. If there's no [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) between `^listen-begin` and `^listen-end`
it is because the recognizer in slot 1 timed out.
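An application that records event names in the order its handlers fire can use this rule to tell successful listen windows from timeouts. The following is a minimal sketch; `listen_windows` and the plain list of event names are assumptions made for illustration, not an SDK facility.

```python
from typing import Iterable, List


def listen_windows(events: Iterable[str]) -> List[bool]:
    """Return one flag per ^listen-begin/^listen-end pair: True if a ^result occurred."""
    outcomes = []
    window_open = False
    got_result = False
    for name in events:
        if name == "^listen-begin":
            window_open, got_result = True, False
        elif name == "^result" and window_open:
            got_result = True
        elif name == "^listen-end" and window_open:
            outcomes.append(got_result)   # False means slot 1 timed out
            window_open = False
    return outcomes
```

A `False` entry marks a window in which the slot 1 recognizer timed out without recognizing a command.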
## Examples
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% bin/snsr-edit -o vg-music.snsr\
-t model/tpl-spot-sequential-1.5.0.snsr\
-f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr\
-f 1 model/spot-music-enUS-1.2.0-m.snsr
# say "voice genie, play music"
% bin/snsr-eval -vvt vg-music.snsr
Using live audio from default capture device. ^C to stop.
Using operating point 17.
Available operating points: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20.
Available vocabulary:
1: "play_music"
2: "previous_song"
3: "stop_music"
4: "next_song"
5: "pause_music"
3180 [^listen-begin]
phrase:
3630 4410 (1 sv) play_music
words:
3630 3900 (1 sv)
3900 4410 (1 sv) play_music
4635 [^listen-end]
^C
```
*[API]: Application Programming Interface
*[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "models/tpl/tpl-spot-vad-lvcsr.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad-lvcsr/"
---
# tpl-spot-vad-lvcsr _(TrulyNatural only)_
This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) runs the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0)
until it detects, segments the audio following the wake word with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type),
and sends the segmented audio to the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type)
recognizer in slot [1](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#1).
This behavior is also available in the [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type)
template, which adds an option to skip the wake word.
`tpl-spot-vad-lvcsr` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot).
Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type):
* **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
* **Slot 1:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr)
**Also see these related items:** [tpl-spot-vad-lvcsr-3.23.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-vad-lvcsr), [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type)
## Operation
```mermaid
flowchart TD
start((start))
start --> startWW
subgraph slot0[slot 0 (phrasespot)]
startWW((start))
fetchWW[/samples from ->audio-pcm/]
audioWW(^sample-count)
processWW[process]
result(0.^result)
stopWW((stop))
startWW --> fetchWW
fetchWW --> audioWW
audioWW --> processWW
processWW --> fetchWW
processWW -->|recognize| result
result --> stopWW
end
subgraph slot1[slot 1 (lvcsr)]
startSTT((start))
startSTTfinal((start))
stopSTT((stop))
stopSTTpartial((stop))
processSTT[process]
partialSTT(^result-partial)
intentSTT(^nlu-intent)
slotSTT(^nlu-slot)
resultSTT(^result)
nluSTT{NLU match?}
slmSTT{SLM included?}
generateSTT[generate]
slmstartSTT(^slm-start)
slmresultpartialSTT(^slm-result-partial)
slmresultSTT(^slm-result)
startSTT --> processSTT
processSTT ---->|hypothesis| partialSTT
partialSTT --> stopSTTpartial
startSTTfinal --> nluSTT
nluSTT -->|yes| intentSTT
nluSTT -->|no| resultSTT
intentSTT --> slotSTT
slotSTT --> resultSTT
slotSTT -->|more| intentSTT
resultSTT --> slmSTT
slmSTT -->|yes| slmstartSTT
slmSTT -->|no| stopSTT
slmstartSTT -->|OK| generateSTT
slmstartSTT -->|STOP| stopSTT
generateSTT -->|response| slmresultpartialSTT
slmresultpartialSTT --> generateSTT
generateSTT -->|done| slmresultSTT
slmresultSTT --> stopSTT
end
listenBegin(^listen-begin)
listenEnd(^listen-end)
stopWW --> listenBegin
listenBegin --> fetch0
fetch0[/samples from ->audio-pcm/]
fetch1[/samples from ->audio-pcm/]
audio0(^sample-count)
audio1(^sample-count)
silence(^silence)
begin(^begin)
END(^end)
limit(^limit)
process0[VAD process]
process1[VAD process]
final@{ shape: f-circ }
fetch0 --> audio0
audio0 --> process0
process0 --> fetch0
process0 -->|speech start| begin
process0 -->|timeout| silence
silence ~~~ final
silence --> listenEnd
begin --> fetch1
fetch1 --> audio1
audio1 --> process1
process1 --> startSTT
stopSTTpartial --> fetch1
process1 -->|speech end| END
process1 -->|speech limit| limit
END --> final
limit --> final
final --> startSTTfinal
stopSTT --> listenEnd
listenEnd --> startWW
```
Operation flow.
1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. If processing does not detect a wake word, continue at step 1.
4. Invoke [0.^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) for the wake word.
5. Invoke [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin) and start VAD processing.
6. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
7. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
8. If VAD processing does not detect the start of speech within the [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) timeout, invoke [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) and continue at step 15.
9. Invoke [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin) if processing detects the start of speech, else continue at step 6.
10. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
11. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
12. If VAD processing detects an endpoint, invoke either [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit) or [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end) and continue at step 14.
13. Process VAD-segmented audio in the LVCSR or STT recognizer.
* Invoke [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) with interim recognition result hypothesis.
* Continue at step 10.
14. Produce a final LVCSR or STT recognition hypothesis.
* Invoke [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) for each NLU intent found.
* Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) with the final recognition hypothesis.
* If there's no SLM, continue at step 15.
* Invoke [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start). If the callback returns [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop), continue at step 15.
* Generate SLM result, invoking [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial) on each generated token.
* Invoke [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result) with complete SLM result.
15. Invoke [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end) and start listening for the wake word again at step 1.
Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in.
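When writing handlers for this template it helps to know the order in which events can arrive after a wake word detection. The sketch below derives that order from the steps above for a single interaction. It is a model, not SDK code: `expected_events` and its parameters are hypothetical, `^sample-count` is omitted because it follows every audio read, and one `^nlu-slot` per intent is a simplification.

```python
from typing import List, Optional


def expected_events(speech: bool = True, endpoint: str = "^end", partials: int = 0,
                    intents: int = 0, slm_tokens: Optional[int] = None,
                    slm_start_ok: bool = True) -> List[str]:
    """List event names, in order, for one wake word interaction in this model."""
    if endpoint not in ("^end", "^limit"):
        raise ValueError("endpoint must be '^end' or '^limit'")
    events = ["0.^result", "^listen-begin"]              # steps 4-5
    if not speech:                                       # step 8: leading-silence timeout
        return events + ["^silence", "^listen-end"]
    events.append("^begin")                              # step 9
    events += ["^result-partial"] * partials             # step 13
    events.append(endpoint)                              # step 12
    for _ in range(intents):                             # step 14: NLU matches
        events += ["^nlu-intent", "^nlu-slot"]
    events.append("^result")                             # final hypothesis
    if slm_tokens is not None:                           # an SLM is included
        events.append("^slm-start")
        if slm_start_ok:                                 # handler did not return STOP
            events += ["^slm-result-partial"] * slm_tokens
            events.append("^slm-result")
    events.append("^listen-end")                         # step 15
    return events
```

In every branch of the model the sequence ends with `^listen-end`, after which the template listens for the wake word again.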
## Settings
**Available events:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin), [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end), [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event), [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence), [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result), [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial), [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start)
**Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator)
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to)
**Available configuration settings:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [backlog-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backlog-interval), [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [custom-vocab](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#custom-vocab), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence), [include-wake-word-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-wake-word-audio), [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [stt-profile](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#stt-profile), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold), [wake-word-at-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#wake-word-at-end)
**Available values:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr), [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code)
## Notes
Use this template for command and control type applications where commands are
initiated with a wake word.
The [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) and [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) events are for the LVCSR or STT recognizer
in slot 1. If you need direct access to the wake word result, prefix the event
with the slot path: `0.^result`. Use the slot prefix to read values in the `0.^result` event handler too. For example, call [getString](https://doc.sensory.com/tnl/7.8/api/inference.md#getters) with key [0.text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) to read the wake word transcription.
Set [include-wake-word-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-wake-word-audio) `= 1` to include the wake word audio in the
samples passed to the LVCSR or STT recognizer. STT hypotheses do not include the wake word
text unless Sensory specifically configured the model to do so.
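The slot-prefix convention amounts to joining the slot number and a setting key with a dot. The helper below is a minimal sketch of that string handling only. `slotKey` is a hypothetical name and not part of the SDK; the resulting string would be used wherever a setting key is expected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an SDK function): writes "<slot>.<key>" into buf.
 * Returns 0 on success, -1 if the key is empty, the slot is negative,
 * or buf is too small to hold the result. */
int
slotKey(char *buf, size_t size, int slot, const char *key)
{
  int n;

  assert(buf != NULL && key != NULL);
  if (slot < 0 || strlen(key) == 0) return -1;
  n = snprintf(buf, size, "%d.%s", slot, key);
  return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With slot `0` and key `^result` the helper produces `0.^result`; with key `text` it produces `0.text`.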
## Examples
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% bin/snsr-edit -o vg-stt.snsr\
-t model/tpl-spot-vad-lvcsr-3.23.0.snsr\
-f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr\
-f 1 model/stt-enUS-automotive-medium-2.3.15-pnc.snsr\
-s include-wake-word-audio=1
# Say "Voice genie, open the sunroof."
% bin/snsr-eval -vt vg-stt.snsr
Using live audio from default capture device. ^C to stop.
P 2770 3250 (0.4166) Open the sun
P 2810 3650 (0.7161) Open the sunroof
1815 3990 [^end] VAD speech region.
NLU intent: open_window (0.9956) = open the sunroof
NLU entity: roof (0.9595) = sunroof
2810 3690 (0.4394) Open the sunroof.
^C
```
*[API]: Application Programming Interface
*[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[NLU]: Natural Language Understanding model
*[SLM]: Generative Small Language Model
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector
---
source_path: "models/tpl/tpl-spot-vad.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-spot-vad/"
---
# tpl-spot-vad
This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) runs the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0)
until it detects, then does start- and endpoint detection with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type)
on the audio stream following the wake word.
`tpl-spot-vad` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot-vad](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot-vad).
Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type):
* **Slot 0:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [tpl-spot-vad-3.13.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-vad)
## Operation
```mermaid
flowchart TD
start((start))
start --> startWW
subgraph slot0[slot 0 (phrasespot)]
startWW((start))
fetchWW[/samples from ->audio-pcm/]
audioWW(^sample-count)
processWW[process]
result(^result)
stopWW((stop))
startWW --> fetchWW
fetchWW --> audioWW
audioWW --> processWW
processWW --> fetchWW
processWW -->|recognize| result
result --> stopWW
end
listenBegin(^listen-begin)
listenEnd(^listen-end)
stopWW --> listenBegin
listenBegin --> fetch0
fetch0[/samples from ->audio-pcm/]
fetch1[/samples from ->audio-pcm/]
audio0(^sample-count)
audio1(^sample-count)
silence(^silence)
begin(^begin)
END(^end)
limit(^limit)
process0[process]
process1[process]
out[\samples to <-audio-pcm\]
final@{ shape: f-circ }
fetch0 --> audio0
audio0 --> process0
process0 --> fetch0
process0 -->|speech start| begin
process0 -->|timeout| silence
silence --> final
begin --> fetch1
fetch1 --> audio1
audio1 --> out
out --> process1
process1 --> fetch1
process1 -->|speech end| END
process1 -->|speech limit| limit
END --> final
limit --> final
final --> listenEnd
listenEnd --> startWW
```
Operation flow.
1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. If processing detects a vocabulary phrase, skip to step 5.
4. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm),
or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok).
5. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result).
6. Invoke [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin).
7. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
8. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
9. If speech is detected within [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) ms, continue at step 12.
10. If _no_ speech is detected within [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) ms, invoke [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence)
and skip to step 19.
11. Continue processing at step 7 until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end).
12. Invoke [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin).
13. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
14. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
15. If [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through) `== 1`, write speech samples to [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out).
16. If the end of speech is detected within [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording) ms, invoke [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end)
and skip to step 19.
17. If the end of speech is _not_ detected within [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording) ms, invoke [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit)
and skip to step 19.
18. Continue processing at step 13 until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end).
19. Invoke [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end).
20. Restart at step 1.
Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in.
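The flow above behaves like a three-state machine: wake word spotting, waiting for speech, and recording speech. The sketch below models only the order in which events fire. It does not call the TrulyNatural SDK, and every name in it (`SvState`, `SvObs`, `SvEvent`, `svStep`) is hypothetical. Detector decisions are supplied as inputs, and the audio output of step 15 is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the event order; none of these names are SDK API. */
typedef enum { SV_SPOT, SV_LISTEN, SV_SPEECH } SvState;
typedef enum {
  SV_OBS_NONE, SV_OBS_PHRASE, SV_OBS_SPEECH_START,
  SV_OBS_TIMEOUT, SV_OBS_SPEECH_END, SV_OBS_LIMIT
} SvObs;
typedef enum {
  SV_EV_SAMPLE_COUNT, SV_EV_RESULT, SV_EV_LISTEN_BEGIN, SV_EV_BEGIN,
  SV_EV_SILENCE, SV_EV_END, SV_EV_LIMIT, SV_EV_LISTEN_END
} SvEvent;

/* Advances the flow by one audio chunk. Stores the events invoked for this
 * chunk in out[] (at most 3) and returns how many were stored. */
size_t
svStep(SvState *state, SvObs obs, SvEvent out[3])
{
  size_t n = 0;

  assert(state != NULL && out != NULL);
  out[n++] = SV_EV_SAMPLE_COUNT;               /* steps 2, 8, 14 */
  switch (*state) {
  case SV_SPOT:
    if (obs == SV_OBS_PHRASE) {                /* steps 3, 5, 6 */
      out[n++] = SV_EV_RESULT;
      out[n++] = SV_EV_LISTEN_BEGIN;
      *state = SV_LISTEN;
    }
    break;
  case SV_LISTEN:
    if (obs == SV_OBS_SPEECH_START) {          /* steps 9, 12 */
      out[n++] = SV_EV_BEGIN;
      *state = SV_SPEECH;
    } else if (obs == SV_OBS_TIMEOUT) {        /* steps 10, 19, 20 */
      out[n++] = SV_EV_SILENCE;
      out[n++] = SV_EV_LISTEN_END;
      *state = SV_SPOT;
    }
    break;
  case SV_SPEECH:
    if (obs == SV_OBS_SPEECH_END || obs == SV_OBS_LIMIT) {  /* steps 16-20 */
      out[n++] = (obs == SV_OBS_SPEECH_END) ? SV_EV_END : SV_EV_LIMIT;
      out[n++] = SV_EV_LISTEN_END;
      *state = SV_SPOT;
    }
    break;
  }
  return n;
}
```

Feeding `SV_OBS_PHRASE`, `SV_OBS_SPEECH_START`, and `SV_OBS_SPEECH_END` in turn walks from spotting through `^result`, `^listen-begin`, `^begin`, `^end`, and `^listen-end`, and ends back in the spotting state.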
## Settings
**Available events:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^listen-begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-begin), [^listen-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#listen-end), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event), [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence)
**Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator)
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream), [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream), [skip-to-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-ms), [skip-to-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-sample)
**Available configuration settings:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence), [include-wake-word-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-wake-word-audio), [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold), [wake-word-at-end](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#wake-word-at-end)
**Available values:** [phrasespot-vad](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot-vad)
**Also see these related items:** [live-segment.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-segment.md#live-segment-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code)
## Notes
Use this template for wake-word-gated audio sent to cloud engines.
Set [include-wake-word-audio](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-wake-word-audio) `= 1` to include the wake word audio in the
VAD audio output stream.
This template writes the VAD-segmented audio to [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out).
If your application does not use this, set [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through) `= 0`.
## Examples
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% bin/snsr-edit -o vg-vad.snsr\
-t model/tpl-spot-vad-3.13.0.snsr\
-f 0 model/spot-voicegenie-enUS-6.5.1-m.snsr\
-s include-wake-word-audio=1
# Say "Voice genie, what's the capital of Oregon?"
% bin/snsr-eval -o vad-audio.wav -vvt vg-vad.snsr
Using live audio from default capture device. ^C to stop.
Using operating point 8.
Available operating points: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21.
Available vocabulary:
1: "voicegenie"
phrase:
1950 2550 (1) voicegenie
words:
1950 2550 (1) voicegenie
2730 [^listen-begin]
2730 [^begin]
1650 4200 [^end] VAD speech region.
4980 [^listen-end]
^C
```
Review _vad-audio.wav_: the recording starts [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff) ms before the
beginning of "voice genie" and continues until [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over) ms after the end
of the utterance.
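The relationship between the detected speech and the recorded audio can be written as simple arithmetic. The following is a sketch under stated assumptions: `Span` and `recordedSpan` are hypothetical names, the clamp at the start of the stream is an assumption, and the values used with it are illustrative rather than SDK defaults.

```c
#include <assert.h>

/* Hypothetical types and helper; not part of the SDK. Times are in ms. */
typedef struct {
  long beginMs;
  long endMs;
} Span;

/* Widens a detected span by backoffMs at the start and holdOverMs at the end.
 * Assumption: the start is clamped at 0, the beginning of the stream. */
Span
recordedSpan(Span detected, long backoffMs, long holdOverMs)
{
  Span r;

  assert(detected.beginMs <= detected.endMs);
  assert(backoffMs >= 0 && holdOverMs >= 0);
  r.beginMs = detected.beginMs > backoffMs ? detected.beginMs - backoffMs : 0;
  r.endMs = detected.endMs + holdOverMs;
  return r;
}
```

For instance, a wake word that begins at 1950 ms with an illustrative backoff of 300 ms gives a recording that starts at 1650 ms.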
*[API]: Application Programming Interface
*[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector
---
source_path: "models/tpl/tpl-vad-lvcsr.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/tpl/tpl-vad-lvcsr/"
---
# tpl-vad-lvcsr _(TrulyNatural only)_
This [template](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) detects speech with a [VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type)
and sends the segmented audio to the [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type)
recognizer in slot [0](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#0).
`tpl-vad-lvcsr` has [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot).
Expected [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type):
* **Slot 0:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr)
**Also see these related items:** [tpl-vad-lvcsr-3.17.0.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-vad-lvcsr), [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-opt-spot-vad-lvcsr.md#tpl-opt-spot-vad-lvcsr-type)
## Operation
```mermaid
flowchart TD
start((start))
start --> fetch0
subgraph slot0[slot 0 (lvcsr)]
startSTT((start))
startSTTfinal((start))
stopSTT((stop))
stopSTTpartial((stop))
processSTT[process]
partialSTT(^result-partial)
intentSTT(^nlu-intent)
slotSTT(^nlu-slot)
resultSTT(^result)
nluSTT{NLU match?}
slmSTT{SLM included?}
generateSTT[generate]
slmstartSTT(^slm-start)
slmresultpartialSTT(^slm-result-partial)
slmresultSTT(^slm-result)
startSTT --> processSTT
processSTT ---->|hypothesis| partialSTT
partialSTT --> stopSTTpartial
startSTTfinal --> nluSTT
nluSTT -->|yes| intentSTT
nluSTT -->|no| resultSTT
intentSTT --> slotSTT
slotSTT --> resultSTT
slotSTT -->|more| intentSTT
resultSTT --> slmSTT
slmSTT -->|yes| slmstartSTT
slmSTT -->|no| stopSTT
slmstartSTT -->|OK| generateSTT
slmstartSTT -->|STOP| stopSTT
generateSTT -->|response| slmresultpartialSTT
slmresultpartialSTT --> generateSTT
generateSTT -->|done| slmresultSTT
slmresultSTT --> stopSTT
end
fetch0[/samples from ->audio-pcm/]
fetch1[/samples from ->audio-pcm/]
audio0(^sample-count)
audio1(^sample-count)
silence(^silence)
begin(^begin)
END(^end)
limit(^limit)
process0[VAD process]
process1[VAD process]
final@{ shape: f-circ }
listenEnd@{ shape: f-circ }
fetch0 --> audio0
audio0 --> process0
process0 --> fetch0
process0 -->|speech start| begin
process0 -->|timeout| silence
silence ~~~ final
silence --> listenEnd
begin --> fetch1
fetch1 --> audio1
audio1 --> process1
process1 --> startSTT
stopSTTpartial --> fetch1
process1 -->|speech end| END
process1 -->|speech limit| limit
END --> final
limit --> final
final --> startSTTfinal
stopSTT --> listenEnd
listenEnd ----> fetch0
```
Operation flow.
1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. If VAD processing does not detect the start of speech within the [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) timeout, invoke [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence) and continue at step 1.
4. Invoke [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin) if processing detects the start of speech, else continue at step 1.
5. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
6. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
7. If VAD processing detects an endpoint, invoke either [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit) or [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end) and continue at step 9.
8. Process VAD-segmented audio in the LVCSR or STT recognizer.
* Invoke [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) with interim recognition result hypothesis.
* Continue at step 5.
9. Produce a final LVCSR or STT recognition hypothesis.
* Invoke [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) for each NLU intent found.
* Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) with the final recognition hypothesis.
* If there's no SLM, continue at step 1.
* Invoke [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start). If the callback returns [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop), continue at step 1.
* Generate SLM result, invoking [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial) on each generated token.
* Invoke [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result) with complete SLM result.
* Continue at step 1.
Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in.
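The order of events in step 9 can be sketched as a small function that lists what would fire for one final hypothesis. This is an illustration only: it does not call the SDK, the names (`FinEvent`, `finalEvents`) are hypothetical, and it assumes every intent carries the same number of slots to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of step 9; none of these names are SDK API. */
typedef enum {
  FIN_NLU_INTENT, FIN_NLU_SLOT, FIN_RESULT,
  FIN_SLM_START, FIN_SLM_RESULT_PARTIAL, FIN_SLM_RESULT
} FinEvent;

/* Stores the event order for one final hypothesis in out[] and returns the
 * number of events, or 0 if out[] (capacity max) is too small.
 *   intents         number of NLU intents found
 *   slotsPerIntent  slots per intent (assumed equal for all intents)
 *   hasSlm          non-zero if the model includes an SLM
 *   slmStop         non-zero if the ^slm-start handler returns STOP
 *   tokens          number of tokens the SLM generates */
size_t
finalEvents(FinEvent *out, size_t max, int intents, int slotsPerIntent,
            int hasSlm, int slmStop, int tokens)
{
  size_t n = 0;
  int i, j;

  assert(out != NULL && intents >= 0 && slotsPerIntent >= 0 && tokens >= 0);
#define EMIT(e) do { if (n == max) return 0; out[n++] = (e); } while (0)
  for (i = 0; i < intents; i++) {
    EMIT(FIN_NLU_INTENT);
    for (j = 0; j < slotsPerIntent; j++) EMIT(FIN_NLU_SLOT);
  }
  EMIT(FIN_RESULT);                      /* always invoked */
  if (hasSlm) {
    EMIT(FIN_SLM_START);
    if (!slmStop) {                      /* STOP skips generation */
      for (i = 0; i < tokens; i++) EMIT(FIN_SLM_RESULT_PARTIAL);
      EMIT(FIN_SLM_RESULT);
    }
  }
#undef EMIT
  return n;
}
```

One intent with two slots and no SLM yields `^nlu-intent`, `^nlu-slot`, `^nlu-slot`, `^result`. With an SLM whose `^slm-start` handler returns STOP, the sequence ends at `^slm-start`.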
## Settings
**Available events:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event), [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence), [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result), [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial), [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start)
**Available iterators:** _none_
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to)
**Available configuration settings:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [custom-vocab](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#custom-vocab), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence), [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording), [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [stt-profile](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#stt-profile)
**Available values:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr), [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code)
## Notes
Use this template for command and control type applications where commands are initiated
just by speaking.
## Examples
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% bin/snsr-edit -o vad-stt.snsr\
-t model/tpl-vad-lvcsr-3.17.0.snsr\
-f 0 model/stt-enUS-automotive-medium-2.3.15-pnc.snsr
# Say, for example: "Turn the air conditioning up all the way"
% bin/snsr-eval -t vad-stt.snsr
P 1000 1040 T
P 1000 1600 Turn the egg
P 1040 2040 Turn the air conditioner
P 1040 2320 Turn the air conditioning up
P 1040 2760 Turn the air conditioning up all the way
NLU intent: set_fan (0.9547) = turn the air conditioning up 100%
NLU entity: hvac (0.9744) = air conditioning
NLU entity: percentage_value (0.8963) = 100%
1040 2880 Turn the air conditioning up all the way.
^C
```
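If you post-process a transcript like the one above, the result lines can be split with `sscanf`. The parser below is a sketch based only on the line shapes shown in this example: it assumes the two numbers are begin and end times in milliseconds, that a leading `P` marks a partial result, and that the non-verbose format is used (no score in parentheses). `EvalLine` and `parseEvalLine` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one transcript line; not an SDK type. */
typedef struct {
  int partial;        /* 1 for "P ..." lines, 0 for final results */
  long beginMs;       /* assumed: begin time in ms */
  long endMs;         /* assumed: end time in ms */
  char text[128];     /* hypothesis text, truncated if longer */
} EvalLine;

/* Parses "P <begin> <end> <text>" or "<begin> <end> <text>".
 * Returns 1 on success, 0 if the line has a different shape. */
int
parseEvalLine(const char *line, EvalLine *out)
{
  int used = 0;

  assert(line != NULL && out != NULL);
  if (sscanf(line, " P %ld %ld %n", &out->beginMs, &out->endMs, &used) == 2) {
    out->partial = 1;
  } else {
    used = 0;
    if (sscanf(line, "%ld %ld %n", &out->beginMs, &out->endMs, &used) != 2) {
      return 0;
    }
    out->partial = 0;
  }
  /* Copy the remainder of the line and drop any trailing newline. */
  strncpy(out->text, line + used, sizeof out->text - 1);
  out->text[sizeof out->text - 1] = '\0';
  out->text[strcspn(out->text, "\r\n")] = '\0';
  return 1;
}
```

Lines that carry other information, such as the `NLU intent:` and `NLU entity:` lines, are rejected and would need their own handling.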
*[API]: Application Programming Interface
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[NLU]: Natural Language Understanding model
*[SLM]: Generative Small Language Model
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector
---
source_path: "models/types/ca.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/ca/"
---
# Adapting wake word
These are fixed wake word models that continuously
adapt to speakers' voices to reduce false-accept rates.
They are drop-in replacements for fixed wake words.
Continuously adapting wake word models have [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
and filenames that by convention match `ca-*.snsr`.
**Also see these related items:** [Adapting wake word models](https://doc.sensory.com/tnl/7.8/models/index.md#ca-models) included in this distribution.
## Operation
```mermaid
flowchart TD
start((start))
fetch[/samples from ->audio-pcm/]
audio(^sample-count)
process
result(^result)
adaptStarted(^adapt-started)
adapted(^adapted)
newUser(^new-user)
start --> fetch
fetch --> audio
audio --> process
process --> fetch
process -->|recognize| result
process -->|recognize w/ high SNR| adaptStarted
adaptStarted --> result
result --> fetch
fetch -->|adapted| adapted
adapted --> fetch
adapted -->|new user identified| newUser
newUser --> fetch
```
Recognition flow.
1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. Invoke [^adapt-started](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapt-started) if processing detects a vocabulary phrase in a low-noise environment. This starts adapting the model to the speaker's voice on a background thread.
4. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase.
5. Invoke [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted) when the background thread has finished adding an enrollment.
* Invoke [^new-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#new-user) if adaptation detects a user it hasn't seen before.
6. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm),
or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok).
Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in.
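The stop conditions in step 6 follow a common pattern: keep processing until the input ends or a handler asks to stop. The loop below is a minimal sketch of that pattern, not the SDK's implementation. `Rc`, `ChunkHandler`, `processChunks`, and `stopAtChunk` are hypothetical stand-ins, and audio chunks are represented by their index only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes standing in for OK, STREAM_END and STOP. */
typedef enum { RC_OK, RC_STREAM_END, RC_STOP } Rc;

/* Hypothetical event handler: called once per processed chunk. */
typedef Rc (*ChunkHandler)(int chunkIndex, void *userData);

/* Calls handler for each chunk until the chunks run out (RC_STREAM_END)
 * or the handler returns anything other than RC_OK.
 * *processed receives the number of chunks handled. */
Rc
processChunks(int chunkCount, ChunkHandler handler, void *userData,
              int *processed)
{
  int i;

  assert(handler != NULL && processed != NULL && chunkCount >= 0);
  *processed = 0;
  for (i = 0; i < chunkCount; i++) {
    Rc rc = handler(i, userData);
    (*processed)++;
    if (rc != RC_OK) return rc;          /* handler ended the run */
  }
  return RC_STREAM_END;                  /* input exhausted */
}

/* Example handler: returns RC_STOP at the chunk index stored in userData. */
Rc
stopAtChunk(int chunkIndex, void *userData)
{
  const int *limit = userData;
  return (chunkIndex == *limit) ? RC_STOP : RC_OK;
}
```

A handler that never returns a code other than `RC_OK` lets the loop run to the end of the input, which mirrors the [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) case.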
## Settings
**Available events:** [^adapt-started](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapt-started), [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^new-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#new-user), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event)
**Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [user-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#user-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator)
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [delete-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#delete-user), [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream), [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream), [rename-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#rename-user)
**Available configuration settings:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [cache-file](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#cache-file), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold)
**Available values:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code)
*[API]: Application Programming Interface
*[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "models/types/enroll.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/enroll/"
---
# Wake word enrollment
These models provide user enrollment for EFT and UDT. They produce [wake-word models](https://doc.sensory.com/tnl/7.8/models/index.md#wake-word-models).
Enrollment models have [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[enroll](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#enroll)
and filenames that by convention match `eft-*.snsr` or `udt-*.snsr`.
**Also see these related items:** [wake word enrollment models](https://doc.sensory.com/tnl/7.8/models/index.md#enroll-models) included in this distribution.
## Operation
Wake word enrollment has two modes: [interactive](https://doc.sensory.com/tnl/7.8/models/types/enroll.md#enroll-interactive)
for live recordings, and [offline](https://doc.sensory.com/tnl/7.8/models/types/enroll.md#enroll-offline) for pre-recorded
enrollment audio.
### Interactive
With [interactive](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#interactive) `= 1`, enrollment tasks expect live audio and re-record enrollments
that cannot be used.
```mermaid
flowchart TD
start((start))
fetch[/samples from ->audio-pcm/]
audio(^sample-count)
segment[segment audio]
check[check audio quality]
validate[check enrollment consistency]
resume(^resume)
next(^next)
pause(^pause)
pass(^pass)
fail0(^fail)
fail1(^fail)
enrolled(^enrolled)
progress(^progress)
adapted(^adapted)
done(^done)
zeroCount[count←0]
incrCount[count++]
start --> next
next --> zeroCount
zeroCount --> resume
resume --> fetch
fetch --> audio
audio --> segment
segment --> fetch
segment -->|endpoint| pause
pause --> check
check -->|good| pass
pass ---> incrCount
incrCount --> resume
pass -->|count == required| validate
validate -->|good| next
validate -->|bad| fail1
check -->|bad| fail0
fail0 --> resume
fail1 --> zeroCount
next -->|user == NULL| enrolled
enrolled --> enroll
enroll --> progress
progress --> enroll
enroll --->|complete| adapted
adapted --> done
done ~~~ validate
```
Interactive enrollment flow.
1. Invoke [^next](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#next).
* If the callback set [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) to `NULL`, start model adaptation at step 8.
2. Reset the enrollment `count` to `0`. This tracks the number of usable enrollments
for the current user or phrase.
3. Invoke [^resume](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#resume). Application should use this to restart audio recording.
4. Make an audio recording and segment it with a VAD (UDT) or a wake word (EFT).
* Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
* Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
* Process and repeat until a speech segment is found.
5. Invoke [^pause](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pause). Application should pause audio recording.
6. Check enrollment audio quality.
* If good, invoke [^pass](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pass).
If `count` has reached [req-enroll](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#req-enroll), start validation at step 7,
else start the next recording at step 3.
* If bad, invoke [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail) and redo the current recording at step 3.
7. Validate all the enrollment recordings, checking for consistency.
* If good, start enrolling the next user or phrase at step 1.
* If bad, invoke [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail) and restart at step 2.
8. When all users / phrases are available, invoke [^enrolled](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#enrolled).
9. Train a new recognizer with the enrollments.
* Call [^progress](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#progress) repeatedly until done.
10. Invoke [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted).
11. Invoke [^done](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#done).
**Also see these related items:** [live-enroll.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-enroll.md#live-enroll-code), [enrollUDT.java](https://doc.sensory.com/tnl/7.8/api/sample/java/enrollUDT.md#enrolludt-code), [Enroll.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-enroll)
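The order of callbacks in the interactive flow can be easier to follow as a small simulation. The following is a minimal sketch in plain Python that does not use the SDK: `interactive_events`, `quality`, and `consistent` are hypothetical names introduced here, and the audio checks are replaced by pre-scripted booleans. It only models the event order described in the steps above.

```python
from typing import Iterable, List


def interactive_events(
    users: List[str],
    quality: Iterable[bool],
    consistent: Iterable[bool],
    required: int = 2,
) -> List[str]:
    """Return the event names an interactive enrollment would invoke.

    users      -- names the (hypothetical) ^next callback would select, in order
    quality    -- scripted audio-quality outcomes, one per recording
    consistent -- scripted consistency outcomes, one per validation
    required   -- stands in for the req-enroll setting
    """
    quality = iter(quality)
    consistent = iter(consistent)
    events: List[str] = []
    for _user in users:
        events.append("^next")                # step 1: callback selects a user
        while True:
            count = 0                         # step 2
            while count < required:
                events.append("^resume")      # step 3
                events.append("^sample-count")  # step 4, collapsed to one call
                events.append("^pause")       # step 5
                if next(quality):             # step 6
                    events.append("^pass")
                    count += 1
                else:
                    events.append("^fail")    # redo the current recording
            if next(consistent):              # step 7
                break
            events.append("^fail")            # restart this user at step 2
    events.append("^next")                    # callback sets user to NULL
    events += ["^enrolled", "^progress", "^adapted", "^done"]  # steps 8 to 11
    return events
```

Calling `interactive_events(["alice"], [True, False, True], [True])` models one user whose second recording is rejected and re-recorded before validation succeeds.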
### Offline
With [interactive](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#interactive) `= 0`, enrollment tasks expect pre-recorded audio and fail
if any of the enrollments cannot be used.
```mermaid
flowchart TD
start((start))
fetch[/samples from ->audio-pcm/]
audio(^sample-count)
segment[segment audio]
check[check audio quality]
validate[check enrollment consistency]
next(^next)
pass(^pass)
fail0(^fail)
fail1(^fail)
enrolled(^enrolled)
progress(^progress)
adapted(^adapted)
done(^done)
skip[discard enrollment]
skipBad[discard bad enrollment]
zeroCount[count←0]
incrCount[count++]
user0{user == NULL?}
user1{user == NULL?}
start --> user0
user0 -->|yes| next
user0 -->|no| user1
next --> user1
user1 -->|no| zeroCount
zeroCount --> fetch
fetch --> audio
audio --> segment
segment --> fetch
segment -->|endpoint| check
check --->|good| pass
pass --> incrCount
incrCount --> fetch
validate --->|bad| fail1
fail1 --> skipBad
skipBad --> validate
validate --> user1
check -->|bad| fail0
fail0 --> skip
skip --> fetch
fetch -->|STREAM_END| validate
user1 ---->|yes| enrolled
enrolled --> enroll
enroll --> progress
progress --> enroll
enroll --->|complete| adapted
adapted --> done
```
Offline enrollment flow.
1. If [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) `== NULL`, invoke [^next](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#next). The application should set [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) before
starting enrollment, or do so in the [^next](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#next) callback.
2. If [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user) `== NULL`, start model adaptation at step 7.
3. Reset the enrollment `count` to `0`. This tracks the number of usable enrollments
for the current user or phrase.
4. Segment audio with a VAD (UDT) or a wake word (EFT).
* Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
* Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
* Process and repeat until a speech segment is found or [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end).
* If [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end), validate at step 6.
5. Check enrollment audio quality.
* If good, invoke [^pass](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pass) and keep the recording.
* If bad, invoke [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail) and discard the recording.
* Start the next recording at step 4.
6. Validate all the enrollment recordings, checking for consistency.
* If no bad recordings remain, start enrolling the next user or phrase at step 2.
* If bad, invoke [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail), remove the recording and revalidate.
7. When all users / phrases are available, invoke [^enrolled](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#enrolled).
8. Train a new recognizer with the enrollments.
* Call [^progress](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#progress) repeatedly until done.
9. Invoke [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted).
10. Invoke [^done](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#done).
**Also see these related items:** [spot-enroll.c](https://doc.sensory.com/tnl/7.8/api/sample/c/spot-enroll.md#spot-enroll-code)
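Unlike the interactive flow, the offline flow discards unusable recordings instead of re-recording them. The sketch below models only steps 3 to 6 for a single user, in plain Python without the SDK. `Recording` and `offline_collect` are hypothetical names, and the per-recording `consistent` flag is a simplification: a real consistency check compares recordings with each other.

```python
from typing import List, NamedTuple, Tuple


class Recording(NamedTuple):
    """Hypothetical stand-in for one pre-recorded enrollment segment."""
    name: str
    good_quality: bool   # scripted outcome of the audio quality check
    consistent: bool     # scripted outcome of the consistency check


def offline_collect(recordings: List[Recording]) -> Tuple[List[str], List[str]]:
    """Return (events, names of recordings kept) for one user."""
    events: List[str] = []
    kept: List[Recording] = []
    for rec in recordings:                 # step 4: one segment per iteration
        events.append("^sample-count")
        if rec.good_quality:               # step 5
            events.append("^pass")
            kept.append(rec)
        else:
            events.append("^fail")         # discard, continue with next segment
    # STREAM_END reached: step 6, remove bad recordings and revalidate
    while any(not r.consistent for r in kept):
        bad = next(r for r in kept if not r.consistent)
        events.append("^fail")
        kept.remove(bad)
    return events, [r.name for r in kept]
```

The returned list of kept names shows which recordings would contribute to adaptation under these scripted outcomes.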
## Settings
**Available events:** [^adapted](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#adapted), [^done](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#done), [^enrolled](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#enrolled), [^fail](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#fail), [^next](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#next), [^pass](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pass), [^pause](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#pause), [^progress](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#progress), [^resume](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#resume), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event)
**Available iterators:** [enrollment-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#enrollment-iterator), [user-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#user-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator)
**Available results:** _none_
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [add-context](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#add-context), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [delete-user](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#delete-user), [re-adapt](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#re-adapt), [user](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#user)
**Available configuration settings:** [accuracy](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#accuracy), [enrollment-task-index](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#enrollment-task-index), [interactive](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#interactive), [req-enroll](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#req-enroll)
**Available values:** [enroll](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#enroll)
**Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code)
*[API]: Application Programming Interface
*[EFT]: Enrolled Fixed Trigger: fixed wake words adapted to a speaker to improve accuracy
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[UDT]: User-Defined Trigger: enrolled wake words and command sets
*[VAD]: Voice Activity Detector
---
source_path: "models/types/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/"
---
# Model types
The [type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type) of a model specifies the runtime behavior: what it does, which [setting keys](https://doc.sensory.com/tnl/7.8/api/setting-keys/index.md#setting-keys) it supports, and when it invokes [event](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#events) callbacks.
These SDKs support both [fundamental](https://doc.sensory.com/tnl/7.8/models/types/index.md#fundamental-models) and [composed](https://doc.sensory.com/tnl/7.8/models/types/index.md#composed-models) model types. Fundamental types include wake words, adapting wake words, models that create wake words through user enrollment, VAD, LVCSR, and STT. Templates add features by composition, combining multiple fundamental models into one.
The TrulyHandsfree SDK supports all wake word, wake word enrollment, and VAD models.
TrulyNatural (Lite) includes TrulyHandsfree and support for LVCSR. TrulyNatural STT includes TrulyNatural (Lite) and adds speech to text.
## Fundamental models
[Wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type)
- Fixed and enrolled wake words, and keyword spotted command sets.
[Adapting wake word](https://doc.sensory.com/tnl/7.8/models/types/ca.md#ca-type)
- Fixed wake word models that continuously adapt to speakers' voices to improve false-accept rates.
[Wake word enrollment](https://doc.sensory.com/tnl/7.8/models/types/enroll.md#enroll-type)
- Adapts fixed (EFT) and user-defined (UDT) wake words to speakers' voices, creating [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) models specific to these speakers.
[VAD](https://doc.sensory.com/tnl/7.8/models/types/vad.md#vad-type)
- Finds the start- and endpoints of speech segments in a stream of audio data.
[LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type) _(TrulyNatural only)_
- These recognizers use a phonetic acoustic model and an FST vocabulary decoder. They are suitable for small to medium vocabulary tasks, but not for audio transcription.
[STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) _(STT only)_
- Audio transcription with transformers.
## Composed models
[Templates](https://doc.sensory.com/tnl/7.8/models/tpl/index.md#template-type) add behavior to the fundamental model types listed above. Use these, for example, to create a single model that waits for a keyword, runs a VAD, and then recognizes the segmented speech with an STT recognizer. This composed model uses the same API as a simple wake word and does not require application code changes.
*[API]: Application Programming Interface
*[EFT]: Enrolled Fixed Trigger: fixed wake words adapted to a speaker to improve accuracy
*[FST]: Finite-State Transducer
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[SDK]: Software Development Kit
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[UDT]: User-Defined Trigger: enrolled wake words and command sets
*[VAD]: Voice Activity Detector
---
source_path: "models/types/lvcsr.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/lvcsr/"
---
# LVCSR _(TrulyNatural only)_
These recognizers use a phonetic acoustic model and an FST vocabulary decoder.
They are suitable for small to medium vocabulary tasks, but not for
unconstrained audio transcription.
These models have [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr) and filenames that
by convention match `lvcsr-*.snsr`.
You can create LVCSR recognizers with [VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub) or by
[specifying a grammar](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-based-recognition) with a build-capable[^1] model.
LVCSR recognizers include support for decoding with statistical [language models],
but Sensory does not distribute the tools used to create these[^2]. Language models can
provide improved accuracy for constrained target domains. _For transcription-type
tasks, an STT model is a better fit._
The Sensory FST decoder supports hybrid models that contain both grammar-based and language model components.
**Also see these related items:** [LVCSR models](https://doc.sensory.com/tnl/7.8/models/index.md#lvcsr-models) included in this distribution.
[^1]: LVCSR models created by [VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub) include build components only if the grammar references
at least one user-defined class, such as `~dynamic-1`. If the grammar contains no unresolved classes
VoiceHub removes the build components to reduce model file size and RAM use.
[^2]: Contact your [sales representative](https://doc.sensory.com/tnl/7.8/contact.md#sales) if you would like to explore using a custom language model
for your application.
## Operation
```mermaid
flowchart TD
start((start))
fetch[/samples from ->audio-pcm/]
audio(^sample-count)
process
partial(^result-partial)
intent(^nlu-intent)
slot(^nlu-slot)
result(^result)
nlu{NLU match?}
start --> fetch
fetch --> audio
audio --> process
process --> fetch
process -->|hypothesis| partial
partial --> fetch
process -->|VAD endpoint or STREAM_END| nlu
nlu -->|yes| intent
nlu -->|no| result
intent --> slot
slot --> result
slot -->|more| intent
result --> fetch
```
Recognition flow.
1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. Invoke [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) with interim recognition hypotheses
every [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval) ms.
4. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm),
one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok), or
an external [VAD](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#vad) detects a speech endpoint.
5. If NLU is configured, invoke [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) for each
top-level result that matches.
6. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) with the final recognition hypothesis.
7. Resume processing from step 1.
**Note:**
LVCSR recognizers do **not** produce a final recognition hypothesis until they
run out of audio samples to process, or an external VAD detects a speech
endpoint.
With live audio you should use these with a VAD template such as
[tpl-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-vad-lvcsr), [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-opt-spot-vad-lvcsr), or [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-vad-lvcsr).
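The recognition flow and the note above can be modeled with a short simulation. This is a sketch in plain Python, not SDK code: `recognition_events` and the frame markers `"audio"`, `"hyp"`, and `"endpoint"` are hypothetical, and the end of the input list stands in for STREAM_END. It illustrates that `^result` is only reached after an endpoint or the end of the stream.

```python
from typing import Iterable, List, Sequence, Tuple


def recognition_events(
    frames: Iterable[str],
    intents: Sequence[Tuple[str, Sequence[str]]] = (),
) -> List[str]:
    """Return the event names invoked for a scripted sequence of frames.

    frames  -- "audio" (nothing new), "hyp" (interim hypothesis is due),
               or "endpoint" (an external VAD found a speech endpoint)
    intents -- scripted NLU matches as (intent name, slot names) pairs
    """
    events: List[str] = []
    pending = False  # True when audio was processed since the last ^result

    def finalize() -> None:
        nonlocal pending
        for intent, slots in intents:
            events.append(f"^nlu-intent {intent}")
            events.extend(f"^nlu-slot {slot}" for slot in slots)
        events.append("^result")
        pending = False

    for frame in frames:
        events.append("^sample-count")
        pending = True
        if frame == "hyp":
            events.append("^result-partial")
        elif frame == "endpoint":
            finalize()
    if pending:          # STREAM_END with unreported audio
        finalize()
    return events
```

With a live stream that never ends and no `"endpoint"` frames, this model would only ever produce `^sample-count` and `^result-partial`, which is why a VAD template is recommended for live audio.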
## Settings
**Available events:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event)
**Available iterators:** _none_
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream), [phrases-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#phrases-stream)
**Available configuration settings:** [ac-prune-top-k](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#ac-prune-top-k), [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [complete-only](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#complete-only), [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval), [ram-limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#ram-limit), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [search.frame-nota](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#searchframe-nota), [show-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#show-silence)
**Available values:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr)
**Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code)
## Notes
Sensory optimizes hybrid models that have a background component solely for detecting speech that is not in
the specified grammar. These models report an [nlu-intent-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-name) of `background` when they detect
out-of-grammar utterances. You should not use the out-of-grammar recognition [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text) result
as this will have a high word error rate. Consider using [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type) for transcription tasks instead.
[phrases-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#phrases-stream) provides a convenient way to specify a recognition vocabulary from an exhaustive
list of alternative utterances.
## Grammar-based recognition
Sensory's LVCSR models use [grammars](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-syntax) to constrain the possible utterances
they can recognize. Focusing on a limited set of words and structures defined in these grammars
improves recognition speed and accuracy at the expense of recognizing arbitrary input.
You can create a custom recognizer by specifying a fixed grammar during development if
the recognition vocabulary is entirely known, or at runtime if it is not. You can also
use a hybrid approach and build the invariant parts during development, and delay
adding [variable parts](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-classes) (such as a list of favorite TV channels) until runtime.
### Creating a recognizer
Create a grammar-based recognizer using the [command-line tools](https://doc.sensory.com/tnl/7.8/tools/index.md#command-line-tools).
This example uses _data/grammars/enrollments.txt_ which contains a sample grammar specification for
the enrollment recordings in _data/enrollments/_.
To create a custom recognizer using this grammar with [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit),
specify an LVCSR model that supports building and [grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream).
### Details: _data/grammars/enrollments.txt_
```
# LVCSR grammar specification for test utterances in data/enrollments/
#
# In a tpl-spot-vad-lvcsr pipeline the prefix would be consumed by the spotter.
prefix = armadillo | jackalope | terminator;
# List of known utterances in the *-c.wav files.
sentence =
18 percent of 643 |
call the nearest target |
how far away is winco |
play more songs by this artist |
record a video |
start a timer for 20 minutes |
i'm running low on gas |
cancel all my meetings on friday |
directions to susan's house |
do i have any new texts |
open my calendar to next week |
set an alarm for 6 am tomorrow;
# Match the prefix and zero or one of the sentences.
# <s> and </s> are sentence start and end markers that
# match silence and small amounts of extraneous speech.
g = <s> $prefix $sentence? </s>;
```
```console
% cd $HOME/Sensory/TrulyNaturalSDK/7.9.0-pre.0
% bin/snsr-edit -vv -t model/lvcsr-build-enUS-14.0.2-5MB.snsr \
-f grammar-stream data/grammars/enrollments.txt \
-o lvcsr-enrollments.snsr
Loading "model/lvcsr-build-enUS-14.0.2-5MB.snsr" as the template model.
Loading "data/grammars/enrollments.txt" into setting "grammar-stream".
Saved edited model to "lvcsr-enrollments.snsr".
```
Run the new model with [snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval):
```console
% bin/snsr-eval -t lvcsr-enrollments.snsr \
-s partial-result-interval=0 \
data/enrollments/armadillo-1-3-c.wav
165 2745 armadillo play more songs by this artist
```
Setting [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval)` = 0` shows only the final recognition hypothesis.
For small grammars such as this the build time is negligible. [snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval) can
build and run the recognizer in a single operation:
```console
% bin/snsr-eval -t model/lvcsr-build-enUS-14.0.2-5MB.snsr \
-f grammar-stream data/grammars/enrollments.txt \
-s partial-result-interval=0 \
data/enrollments/armadillo-1-3-c.wav
165 2745 armadillo play more songs by this artist
```
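The `$prefix` and `$sentence` references in the grammar above are resolved when the recognizer is built, and each substituted value is wrapped in an implicit group. The following plain-Python sketch imitates that substitution on grammar text so you can inspect the expanded final rule. It is not the SDK's parser: `expand_final_rule` is a hypothetical helper, and it only recognizes the operator characters that appear in this document's examples.

```python
import re

# One rule has the form `name = expr ;`.
RULE = re.compile(r"\s*([^\s=;]+)\s*=\s*([^;]*);")
# A `$name` reference; the name ends at whitespace or an operator character.
# Assumption: the operator set is limited to characters seen in the examples.
REFERENCE = re.compile(r"\$([^\s()|?{};]+)")


def expand_final_rule(grammar: str) -> str:
    """Return the last rule with every `$name` replaced by `(value)`."""
    # `#` starts a comment that extends to the end of the line.
    text = "\n".join(line.split("#", 1)[0] for line in grammar.splitlines())
    rules = {}
    last = None
    for name, expr in RULE.findall(text):
        expr = " ".join(expr.split())  # normalize whitespace and line breaks
        # Substitute rules defined earlier, adding the implicit grouping.
        rules[name] = REFERENCE.sub(lambda m: "(" + rules[m.group(1)] + ")", expr)
        last = name
    if last is None:
        raise ValueError("no rules found")
    return f"{last} = {rules[last]};"
```

`~class` references are left untouched, since those are resolved at runtime rather than at build time. A reference to an undefined rule raises `KeyError` in this sketch.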
### Classes
A symbol that starts with the tilde `~` sigil specifies a [recognition class](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-syntax-class).
Class recognizers have their own grammar specifications, separate from the top-level
grammar. The behavior of a class-based recognizer is similar to that specified
by a rule. Classes, however, can be updated without recompiling the rest of the grammar,
and all references to a class use the same recognizer. This can reduce the recognizer size
and improve build speed.
This example uses a modified enrollment grammar which references two toy
classes: `~number` and `~place`:
**`enrollments-class.txt`**
```
# LVCSR grammar specification for test utterances in data/enrollments/
# This references two class sub-recognizers: ~number and ~place
#
# In a tpl-spot-vad-lvcsr pipeline the prefix would be consumed by the spotter.
prefix = armadillo | jackalope | terminator;
# List of known utterances in the *-c.wav files.
sentence =
~number percent of ~number |
call the nearest ~place |
how far away is ~place |
play more songs by this artist |
record a video |
start a timer for ~number minutes |
i'm running low on gas |
cancel all my meetings on friday |
directions to ~place |
do i have any new texts |
open my calendar to next week |
set an alarm for ~number am tomorrow;
# Match the prefix and zero or one of the sentences.
# <s> and </s> are sentence start and end markers that
# match silence and small amounts of extraneous speech.
g = <s> $prefix $sentence? </s>;
```
**`place.txt`**
```
# Example place name class recognizer.
g = target | winco | susan's house;
```
The `~number` and `~place` classes referenced in _enrollments-class.txt_
create two new dynamic settings for these classes: `grammar-stream.number` and
`grammar-stream.place`. Specify these to create a complete recognizer:
```console
% snsr-edit -v -t model/lvcsr-build-enUS-14.0.2-5MB.snsr \
-f grammar-stream enrollments-class.txt \
-g grammar-stream.number "g = 18 | 643 | 20 | 6;" \
-o lvcsr-enrollments-class.snsr
Output written to "lvcsr-enrollments-class.snsr".
```
[snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit)'s `-g` option sets the [grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream)`.number` stream to a string argument. A file can also be used for the number grammar.
Run the recognizer:
```console
% snsr-eval -v -t lvcsr-enrollments-class.snsr \
-s partial-result-interval=0 \
data/enrollments/armadillo-1-0-c.wav
375 3150 (1.863e-08) armadillo 18 percent of 643
```
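Class grammars such as the `~number` and `~place` definitions above are plain alternations, so an application can generate them from data it already has. The sketch below builds such a string in plain Python. `class_grammar` is a hypothetical helper, not part of the SDK, and rejecting the listed characters is a conservative assumption rather than the grammar's full escaping rules.

```python
from typing import Iterable

# Characters that have a special meaning in the grammars shown in this
# document. Rejecting them is a conservative assumption for this sketch.
RESERVED = set("|;#$~(){}?=")


def class_grammar(alternatives: Iterable[object]) -> str:
    """Return `g = a | b | c;` for the given alternatives."""
    items = []
    for alt in alternatives:
        phrase = " ".join(str(alt).split())  # collapse whitespace
        if not phrase:
            continue
        if RESERVED & set(phrase):
            raise ValueError(f"reserved character in {phrase!r}")
        if phrase not in items:              # drop duplicates, keep order
            items.append(phrase)
    if not items:
        raise ValueError("at least one alternative is required")
    return "g = " + " | ".join(items) + ";"
```

The resulting string has the same shape as the `-g grammar-stream.number` argument and the contents of `place.txt` shown above.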
### Class libraries
TrulyNatural 6.15.0 introduced support for pre-built binary class repositories.
These contain classes built from frequently used grammar fragments such as dates, times, and numbers.
Load binary class repositories into the same [Session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) as an LVCSR model to add this capability to the model. If a grammar references a class that's not explicitly defined, the class name is looked up in the provided class library or libraries. System class libraries provided by Sensory use a prefix of `s.` for all class names.
See [lvcsr-lib-enUS-14.0.2.snsr](https://doc.sensory.com/tnl/7.8/models/index.md#lvcsr-lib-enUS) for a description of the classes used below.
**`class-lib.txt`**
```
# Example recognizer with classes from a class library
call = call {number ~s.phone-number};
emergency = ~s.call-emergency;
timer = {timer ~s.timer-phrases};
commands = {call} | {emergency} | $timer;
g = $commands ;
```
This example uses live audio, so it needs [snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval)'s `-a` flag
to add a [VAD](https://doc.sensory.com/tnl/7.8/models/tpl/tpl-vad-lvcsr.md#tpl-vad-lvcsr-type) to find the end of each utterance and signal
the recognizer to produce a final hypothesis.
```console
% snsr-eval -a -t model/lvcsr-build-enUS-14.0.2-5MB.snsr \
-t model/lvcsr-lib-enUS-14.0.2.snsr \
-f grammar-stream class-lib.txt \
-s partial-result-interval=0
# Say: Call 1 800 555 1212
NLU intent: call (0) = call one eight hundred five five five one two one two
NLU entity: number (0) = one eight hundred five five five one two one two
3360 6855 call one eight hundred five five five one two one two
# Say: Set a timer for 31 minutes.
NLU intent: timer (0) = set a timer for thirty one minutes
14610 16770 set a timer for thirty one minutes
# Say: Call the fire department.
NLU intent: emergency (0) = call the fire department
24540 25890 call the fire department
```
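If you capture this console output in a script, the lines can be split into structured records. The following is a sketch in plain Python whose patterns are inferred only from the sample output in this document, so treat them as assumptions. `parse_line`, `Nlu`, and `Result` are hypothetical names, and the meaning of the two leading numbers and of the parenthesized value printed with `-v` is assumed, not taken from a format specification.

```python
import re
from typing import NamedTuple, Optional, Union


class Nlu(NamedTuple):
    kind: str    # "intent" or "entity"
    name: str
    index: int   # the number shown in parentheses
    text: str


class Result(NamedTuple):
    begin: int              # first number on the line (assumed start position)
    end: int                # second number on the line (assumed end position)
    score: Optional[float]  # parenthesized value printed with -v, if present
    text: str


NLU_RE = re.compile(r"NLU (intent|entity): (\S+) \((\d+)\) = (.*)")
RESULT_RE = re.compile(r"(\d+)\s+(\d+)\s+(?:\(([^)]+)\)\s+)?(.*)")


def parse_line(line: str) -> Union[Nlu, Result, None]:
    """Parse one output line; return None for anything unrecognized."""
    line = line.strip()
    m = NLU_RE.fullmatch(line)
    if m:
        return Nlu(m.group(1), m.group(2), int(m.group(3)), m.group(4))
    m = RESULT_RE.fullmatch(line)
    if m:
        score = float(m.group(3)) if m.group(3) else None
        return Result(int(m.group(1)), int(m.group(2)), score, m.group(4))
    return None
```

Lines that match neither pattern, such as the `# Say:` prompts in the transcript above, return `None`.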
**C/C++**
Configuring class-based recognition with the C API:
```c
SnsrSession s;
snsrNew(&s);
snsrLoad(s, snsrStreamFromFileName("model/tpl-vad-lvcsr-3.17.0.snsr", "r"));
snsrSetStream(s, SNSR_SLOT_0,
snsrStreamFromFileName("model/lvcsr-build-enUS-14.0.2-5MB.snsr", "r"));
snsrLoad(s, snsrStreamFromFileName("model/lvcsr-lib-enUS-14.0.2.snsr", "r"));
snsrSetStream(s, SNSR_GRAMMAR_STREAM,
snsrStreamFromFileName("class-lib.txt", "r"));
if (snsrRC(s) != SNSR_RC_OK) {
fprintf(stderr, "ERROR: %s\n", snsrErrorDetail(s));
return snsrRC(s);
}
```
**Java**
Configuring class-based recognition with the Java API:
```java
SnsrSession s = new SnsrSession();
try {
s.load(SnsrStream.fromFileName("model/tpl-vad-lvcsr-3.17.0.snsr", "r"));
s.setStream(Snsr.SLOT_0,
SnsrStream.fromFileName("model/lvcsr-build-enUS-14.0.2-5MB.snsr", "r"));
s.load(SnsrStream.fromFileName("model/lvcsr-lib-enUS-14.0.2.snsr", "r"));
s.setStream(Snsr.GRAMMAR_STREAM,
SnsrStream.fromFileName("class-lib.txt", "r"));
} catch (IOException e) {
e.printStackTrace();
return s.rC();
}
```
**Python**
Configuring class-based recognition with the Python API:
```python
try:
with snsr.Session() as s:
s.load("model/tpl-vad-lvcsr-3.17.0.snsr")
s.set_stream(
snsr.SLOT_0,
snsr.Stream.from_filename("model/lvcsr-build-enUS-14.0.2-5MB.snsr", "r"),
)
s.load("model/lvcsr-lib-enUS-14.0.2.snsr")
s.set_stream(
snsr.GRAMMAR_STREAM,
snsr.Stream.from_filename("class-lib.txt", "r"),
)
except snsr.Error as e:
print(f"ERROR: {e.message}")
```
### Syntax
A [context-free grammar] is a set of rules that describes the sequences
of words that an LVCSR model can recognize.
#### Definition
1. Grammars use [UTF-8][] encoding.
1. `#` marks the start of a comment, which extends to the end of the line.
1. A _grammar_ is a series of _rules_ representing variable definitions.
The final rule in a grammar specifies the recognition vocabulary and typically
references rules defined earlier. It should include the sentence start (``)
and end (``) markers.
1. A _rule_ is an assignment of the form `name = expr ;` where
`name` is a _symbol_ and `expr` is a sequence of _symbols_ and _operators_.
`expr` is a type of [regular expression][].
1. A _symbol_ is a sequence of characters that does not include any whitespace
or operators, optionally prefixed by sigils `$` or `~`. A symbol without a
sigil is called a _terminal_ and is part of the recognition vocabulary,
for example `temperature`. [Special symbols](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-special) are predefined
terminals that describe input characteristics such as pauses and the edges of an utterance.
1. The `$` sigil does rule substitution _at build time_. The parser substitutes the
value of the rule named `name` for `$name`. Substitutions include
an implicit _grouping_ operator: Grammar `a = 1 | 2 | 3; b = $a ;`
is equivalent to `b = (1 | 2 | 3) ;`.
1. The `~` sigil substitutes a named recognition class _at runtime_.
- Each class is a recognizer with its own grammar, separate from the main grammar.
- All references to a class use instances of the same class recognizer.
- You can update each class in isolation, without having to recompile the main grammar.
- If you have a large rule that's referenced multiple times, converting it to a
class can speed up build time significantly.
- Use classes to augment a recognition vocabulary at runtime. In a voice dialing
application, for example, one would define the entire recognition grammar
at build time but use `~contacts` instead of a predefined list of contact names.
Once loaded, the application would scan the address book and build only the
`~contacts` class.
- Specify class definitions with [grammar-stream.classname](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream) or [phrases-stream.classname](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#phrases-stream), for example `phrases-stream.contacts`.
2. Operators include _grouping_ parentheses, brackets, and braces, _infix_ operators that indicate
logical AND and OR between symbols, and _postfix_ operators that change how the preceding symbol matches input.
The [operator precedence](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-op-precedence) table lists the order and direction in which the parser applies
operators.
3. Grouping
- `( )` Parentheses enclose items that are grouped together.
- `[ ]` Square brackets enclose optional items.
`[...]` is equivalent to `(...)?`.
- `{ }` Braces implement slot-capturing lightweight NLU markup.
- `{slotName a b c}` makes `a b c` available as the [nlu-slot-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-value) of
[nlu-slot-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-slot-name) `slotName` when the recognizer matches `a b c` to the input
audio.
- You can nest NLU slots to an arbitrary depth.
- The outermost slots are defined as [intents](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-name) and all the nested
slots in each intent as [entities](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-name).
- Each identified intent invokes handlers registered for [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent)
and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot).
- `{rule}` is shorthand for `{rule $rule}`.
- With this grammar:
```
seconds = 1 | 2 | 4 | 8 | half:0.5 a:? | a:? quarter:0.25 [of: a:];
shutterSpeed = set shutter speed to {seconds} ( second | seconds );
cmd = {shutterSpeed} ;
```
an utterance of "set shutter speed to a quarter of a second" will produce
`set shutter speed to 0.25 second`
as recognition output,
with an additional [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) callback for the top-level
`shutterSpeed` slot:
```
NLU intent: shutterSpeed (0) = set shutter speed to 0.25 second
NLU entity: seconds (0) = 0.25
```
4. Infix operators
- These are valid between symbols and may be surrounded by whitespace.
- `^` is the conjunction operator and is implied between adjacent terminals:
Grammar `g = one two three;` will recognize only the sequence "one two three".
- `|` is the disjunctive operator. It separates alternative items.
Grammar `g = one | two | three;` will recognize "one", or "two", or "three".
5. Postfix operators
- These directly follow a symbol without any intervening whitespace.
- `?` A question mark following a symbol makes that symbol optional:
It requires zero or one repetitions of the symbol.
- `+` A plus sign following a symbol or a group requires one or more repetitions of it.
- `*` An asterisk following a symbol or a group requires zero or more repetitions.
- `:` is the rewrite operator.
- `left:right` recognizes symbol `left` but produces terminal `right` as a recognition result.
- `left:` recognizes symbol `left` but rewrites that to an empty string, eliding
`left` from the recognition result.
- `:right` inserts `right` into the recognition result.
If you say "one two three", grammar
`g = one :mississippi two :mississippi three ;` produces
"one mississippi two mississippi three".
- `/` A forward slash after a symbol, followed by a floating-point number, assigns
a weight to that symbol. If there's a rewrite operator (`:`),
the slash must follow the rewritten-to terminal, for example `one:een/0.123`.
Weights are in the logprob domain: convert a probability $p$ in $(0, 1]$ to
a weight with $w = -\log_{10}(p)$.
The default symbol weight is `0`, which corresponds to a probability of `1.0`.
6. `\` is the escape symbol. To include a literal special character in a grammar specification, escape it with a backslash. The characters that support this are:
`^`, `|`, `*`, `+`, `?`, `=`, `[ ]`, `( )`, `;`, `#`, and `:`.
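The probability-to-weight conversion described for the `/` operator can be sketched in a few lines of Python. This is an illustration only and not part of the SDK: the helper names `prob_to_weight` and `weighted_symbol` are hypothetical, and rounding the weight to three decimals is an assumption made for readability.

```python
import math


def prob_to_weight(p: float) -> float:
    """Convert a probability in (0, 1] to a grammar weight, w = -log10(p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log10(p) + 0.0  # adding 0.0 turns -0.0 into 0.0


def weighted_symbol(symbol: str, p: float) -> str:
    """Format `symbol/weight`. Hypothetical helper, not an SDK function."""
    return f"{symbol}/{prob_to_weight(p):.3f}"


print(weighted_symbol("one:een", 0.5))
print(prob_to_weight(1.0))
```

A probability of `1.0` maps to the default weight of `0`, and smaller probabilities map to larger weights.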
**Also see these related items:** [grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#grammar-stream), [phrases-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#phrases-stream), [nlu-grammar-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#nlu-grammar-stream), [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot)
#### Operator precedence
The following table lists the precedence and associativity of grammar
operators. Operators are listed in descending precedence: level `0`
is applied first and level `5` last.
Precedence | Operator | Description | Associativity
:---------:|:--------:|-------------|--------------
0 | `:` | Rewrite output
0 | `/` | Symbol weight
1 | `( )` | Grouping
1 | `[ ]` | Optional group
1 | `{ }` | Slot-capturing semantic markup
2 | `?` | Zero-or-one symbol | left-to-right
2 | `+` | One-or-more symbols | left-to-right
2 | `*` | Zero-or-more symbols | left-to-right
3 | `^` | And, implied between symbols | right-to-left
4 | `|` | Alternative | right-to-left
5 | `=` | Rule assignment | right-to-left
This grammar:
```
a = one | two three four;
g = ( $a | five six) ;
```
will recognize only these phrases:
```
one
two three four
five six
```
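To make the effect of precedence concrete, the following Python sketch enumerates the phrases of the grammar above from a hand-built expression tree. It is a toy model rather than the SDK's grammar compiler: the tuple representation and the `phrases` function are assumptions made for illustration, and the tree is written by hand according to the precedence table instead of being parsed.

```python
from itertools import product


def phrases(expr):
    """Enumerate the phrases matched by a hand-built expression tree.

    A str is a terminal; ("seq", ...) is the implied `^` between symbols;
    ("alt", ...) is `|`. The tree is built by hand, this is not a parser.
    """
    if isinstance(expr, str):
        return [expr]
    kind, *parts = expr
    expanded = [phrases(part) for part in parts]
    if kind == "alt":
        return [phrase for alternatives in expanded for phrase in alternatives]
    if kind == "seq":
        return [" ".join(words) for words in product(*expanded)]
    raise ValueError(f"unknown node type: {kind}")


# a = one | two three four;   `^` binds tighter than `|`
a = ("alt", "one", ("seq", "two", "three", "four"))
# g = ( $a | five six) ;      `$a` substitutes with implicit grouping
g = ("alt", a, ("seq", "five", "six"))

for phrase in phrases(g):
    print(phrase)
```

If `|` bound tighter than the implied `^`, rule `a` would instead group as `(one | two) three four` and match "one three four" and "two three four".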
#### Special symbols
A grammar can include these special symbols:
- `` - The silence at the start of a sentence.
- `` - The silence at the end of a sentence.
- `` - Short pauses between words. The grammar compiler automatically
adds these where needed, so there is no need to do so explicitly.
Do **not** add `` to NLU grammars; use `` instead.
- `` - An explicit short pause.
- `` - Matches when none of the alternatives are likely
(i.e. "none of the above").
+ Recognition results at the phrase level can include `` even
if this symbol was not explicitly used in the grammar. This is an
indication that the result was rejected due to [search.frame-nota](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#searchframe-nota), or that
RAM or CPU constraints limited the recognizer's ability to produce a result.
- `` - Similar to ``. In *some* models the
threshold for determining whether this symbol matches better than any
other is different from that of ``.
- `.` - When used with lightweight [NLU grammars](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#nlu-grammar-stream)
a single period matches any input word.
Use `.:*` to match any input words and remove them from the NLU result.
[regular expression]: https://en.wikipedia.org/wiki/Regular_expression
[UTF-8]: https://en.wikipedia.org/wiki/UTF-8
*[API]: Application Programming Interface
*[FST]: Finite-State Transducer
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[NLU]: Natural Language Understanding model
*[RAM]: Random Access Memory
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector
---
source_path: "models/types/stt.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/stt/"
---
# Speech To Text _(STT only)_
These models do audio transcription with transformers.
STT models have [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr) and filenames that
by convention match `stt-*.snsr`.
**Also see these related items:** [STT models](https://doc.sensory.com/tnl/7.8/models/index.md#stt-models) included in this distribution.
## Operation
```mermaid
flowchart TD
start((start))
fetch[/samples from ->audio-pcm/]
audio(^sample-count)
process[process]
partial(^result-partial)
intent(^nlu-intent)
slot(^nlu-slot)
result(^result)
nlu{NLU match?}
slm{SLM included?}
generate[generate]
slmstart(^slm-start)
slmresultpartial(^slm-result-partial)
slmresult(^slm-result)
start --> fetch
fetch --> audio
audio --> process
process --> fetch
process -->|hypothesis| partial
partial --> fetch
process -->|VAD endpoint or STREAM_END| nlu
nlu -->|yes| intent
nlu -->|no| result
intent --> slot
slot --> result
slot -->|more| intent
result --> slm
slm -->|yes| slmstart
slm -->|no| fetch
slmstart -->|OK| generate
slmstart -->|STOP| fetch
generate -->|response| slmresultpartial
slmresultpartial --> generate
generate -->|done| slmresult
slmresult --> fetch
```
Recognition flow.
1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. Invoke [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial) with interim recognition hypotheses
every [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval) ms.
4. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm),
one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok), or
an external [VAD](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#vad) detects a speech endpoint.
5. If NLU is configured, invoke [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent) and [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot) for each
top-level result that matches.
6. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) with the final recognition hypothesis.
7. If an SLM is not available, resume processing at step 1.
8. Invoke [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start). If the handler returns [STOP](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stop),
resume processing at step 1.
9. Invoke [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial) as the model generates text.
10. Invoke [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result) when text generation is complete.
11. Resume processing at step 1.
**Note:**
STT recognizers do **not** produce a final recognition hypothesis until they
run out of audio samples to process, or an external VAD detects a speech
endpoint.
With live audio you should use these with a VAD template such as
[tpl-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-vad-lvcsr), [tpl-opt-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-opt-spot-vad-lvcsr), or [tpl-spot-vad-lvcsr](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-vad-lvcsr).
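The order in which handlers fire for a single utterance can be modeled without the SDK. The Python sketch below follows the recognition flow literally and is not the library's implementation: the function `stt_events` and its parameters are hypothetical, it emits one `^sample-count` per interim hypothesis (real sessions report many more), and it assumes one `^nlu-slot` per matched intent, as drawn in the flowchart.

```python
def stt_events(partials, intents, has_slm=False, slm_start_rc="OK", slm_partials=0):
    """Return the event order for one utterance in a toy model of the flow.

    partials: number of interim hypotheses before the endpoint.
    intents: number of top-level NLU matches.
    slm_start_rc: what the ^slm-start handler returns, "OK" or "STOP".
    """
    events = []
    for _ in range(partials):
        events += ["^sample-count", "^result-partial"]
    events.append("^sample-count")  # last audio read before the endpoint
    for _ in range(intents):
        events += ["^nlu-intent", "^nlu-slot"]
    events.append("^result")
    if has_slm:
        events.append("^slm-start")
        if slm_start_rc == "OK":
            events += ["^slm-result-partial"] * slm_partials
            events.append("^slm-result")
    return events


print(stt_events(partials=1, intents=1))
print(stt_events(partials=0, intents=0, has_slm=True, slm_partials=2))
```

Returning `"STOP"` from the `^slm-start` handler in this model skips text generation, so the sequence ends at `^slm-start`.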
## Settings
**Available events:** [^nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent), [^nlu-slot](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-slot), [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result-partial), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event), [^slm-result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result), [^slm-result-partial](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-result-partial), [^slm-start](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#slm-start)
**Available iterators:** _none_
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to)
**Available configuration settings:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [custom-vocab](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#custom-vocab), [partial-result-interval](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#partial-result-interval), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [stt-profile](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#stt-profile)
**Available values:** [lvcsr](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#lvcsr)
**Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code)
*[API]: Application Programming Interface
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[NLU]: Natural Language Understanding model
*[SLM]: Generative Small Language Model
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector
---
source_path: "models/types/vad.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/vad/"
---
# VAD
Models of this type find speech segments in audio data streams.
VAD models have [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[vad](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#vad) and filenames that
by convention match `vad-*.snsr`.
**Also see these related items:** [VAD models](https://doc.sensory.com/tnl/7.8/models/index.md#vad-models) included in this distribution.
## Operation
```mermaid
flowchart TD
start((start))
fetch0[/samples from ->audio-pcm/]
fetch1[/samples from ->audio-pcm/]
audio0(^sample-count)
audio1(^sample-count)
silence(^silence)
begin(^begin)
END(^end)
limit(^limit)
process0[process]
process1[process]
out[\samples to <-audio-pcm\]
final@{ shape: f-circ }
start --> fetch0
fetch0 --> audio0
audio0 --> process0
process0 --> fetch0
process0 -->|speech start| begin
process0 -->|timeout| silence
silence --> final
begin --> fetch1
fetch1 --> audio1
audio1 --> out
out --> process1
process1 --> fetch1
process1 -->|speech end| END
process1 -->|speech limit| limit
END --> final
limit --> final
final --> fetch0
```
Endpointing flow.
1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. If speech is detected within [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) ms, continue at step 6.
4. If _no_ speech is detected within [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence) ms, invoke [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence)
and restart from step 1.
5. Continue processing at step 1 until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end).
6. Invoke [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin).
7. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
8. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
9. If [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through) `== 1`, write speech samples to [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out).
10. If the end of speech is detected within [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording) ms, invoke [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end)
and restart from step 1.
11. If the end of speech is _not_ detected within [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording) ms, invoke [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit)
and restart from step 1.
12. Continue processing at step 7 until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end).
Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in.
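The endpointing flow above can be modeled as a small state machine. The Python sketch below is a toy model and not Sensory's VAD algorithm: it takes precomputed per-frame speech flags instead of audio, the function name `vad_events` and the 10 ms frame size are hypothetical, and it assumes that the end of speech is declared after `trailing_silence` ms of consecutive non-speech. Settings such as backoff and hold-over are ignored.

```python
def vad_events(frames, frame_ms=10, leading_silence=50,
               trailing_silence=30, max_recording=200):
    """Toy endpointing loop over per-frame speech flags (True = speech).

    Returns (event, time_ms) tuples, where time_ms is the end of the
    frame that triggered the event.
    """
    events = []
    in_speech = False
    elapsed = 0  # ms since the current state was entered
    quiet = 0    # ms of consecutive non-speech while in speech
    for index, is_speech in enumerate(frames):
        now = (index + 1) * frame_ms
        elapsed += frame_ms
        if not in_speech:
            if is_speech:
                events.append(("^begin", now))
                in_speech, elapsed, quiet = True, 0, 0
            elif elapsed >= leading_silence:
                events.append(("^silence", now))
                elapsed = 0  # restart from step 1
        else:
            quiet = 0 if is_speech else quiet + frame_ms
            if quiet >= trailing_silence:
                events.append(("^end", now))
                in_speech, elapsed = False, 0
            elif elapsed >= max_recording:
                events.append(("^limit", now))
                in_speech, elapsed = False, 0
    return events


# 50 ms of silence, 40 ms of speech, then 30 ms of silence.
flags = [False] * 5 + [True] * 4 + [False] * 3
for event, time_ms in vad_events(flags):
    print(event, time_ms)
```

In this model each speech segment ends with exactly one of `^end` or `^limit`, matching the two exits from the second loop in the flowchart.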
## Settings
**Available events:** [^begin](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#begin), [^end](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#end), [^limit](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#limit), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event), [^silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#silence)
**Available iterators:** _none_
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [<-audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-pcm-out), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [skip-to-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-ms), [skip-to-sample](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#skip-to-sample)
**Available configuration settings:** [backoff](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#backoff), [hold-over](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#hold-over), [include-leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#include-leading-silence), [leading-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#leading-silence), [max-recording](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#max-recording), [pass-through](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#pass-through), [trailing-silence](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#trailing-silence)
**Available values:** [vad](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#vad)
**Also see these related items:** [live-segment.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-segment.md#live-segment-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code)
*[API]: Application Programming Interface
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector
---
source_path: "models/types/wake-word.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/models/types/wake-word/"
---
# Wake word
Fixed and enrolled wake words, and command sets.
Wake word models have [task-type](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type)` == `[phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot) and filenames that
by convention match `spot-*.snsr`.
You can create custom wake words and command sets with [VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub) or
[wake word enrollment](https://doc.sensory.com/tnl/7.8/models/types/enroll.md#enroll-type).
**Also see these related items:** [Wake word models](https://doc.sensory.com/tnl/7.8/models/index.md#wake-word-models) included in this distribution.
## Operation
```mermaid
flowchart TD
start((start))
fetch[/samples from ->audio-pcm/]
audio(^sample-count)
process[process]
result(^result)
start --> fetch
fetch --> audio
audio --> process
process --> fetch
process -->|recognize| result
result --> fetch
```
Recognition flow.
1. Read audio data from [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm).
2. Invoke [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event).
3. Invoke [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result) if processing detects a vocabulary phrase.
4. Continue processing until [STREAM_END](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_stream_end) occurs on [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm),
or one of the event handlers returns a code other than [OK](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_ok).
Register callback handlers with [setHandler](https://doc.sensory.com/tnl/7.8/api/inference.md#sethandler) only for those events you're interested in.
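The stopping conditions of this loop can be illustrated with a short Python model. It does not use the SDK: the `run` function, the string return codes, and the `(sample_count, phrase)` tuples that stand in for audio are all assumptions made for this sketch, and the phrases are made-up examples.

```python
OK, STOP, STREAM_END = "OK", "STOP", "STREAM_END"


def run(chunks, handlers):
    """Toy model of the recognition loop; not the SDK's implementation.

    chunks: iterable of (sample_count, phrase_or_None) tuples standing in
    for audio read from ->audio-pcm. handlers: dict mapping an event name
    to a callable that returns OK or another code. Events without a
    registered handler are skipped.
    """
    total = 0
    for sample_count, phrase in chunks:
        total += sample_count
        events = [("^sample-count", total)]
        if phrase is not None:
            events.append(("^result", phrase))
        for name, payload in events:
            handler = handlers.get(name)
            if handler is not None:
                rc = handler(payload)
                if rc != OK:
                    return rc  # a handler asked processing to stop
    return STREAM_END


spotted = []


def on_result(phrase):
    spotted.append(phrase)
    return STOP  # stop after the first detection


chunks = [(240, None), (240, "hello"), (240, "again")]
print(run(chunks, {"^result": on_result}), spotted)
```

Because only `^result` has a handler here, `^sample-count` is never dispatched, which mirrors the advice to register handlers only for the events you need.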
## Settings
**Available events:** [^result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result), [^sample-count](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#sample-count-event)
**Available iterators:** [operating-point-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#operating-point-iterator), [vocab-iterator](https://doc.sensory.com/tnl/7.8/api/setting-keys/iterators.md#vocab-iterator)
**Available results:** [audio-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream), [audio-stream-first](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-first), [audio-stream-last](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#audio-stream-last)
**Available runtime settings:** [->audio-pcm](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#-audio-pcm), [audio-stream-from](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-from), [audio-stream-to](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#audio-stream-to), [dsp-acmodel-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-acmodel-stream), [dsp-header-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-header-stream), [dsp-search-stream](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-search-stream)
**Available configuration settings:** [audio-stream-size](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#audio-stream-size), [delay](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#delay), [dsp-target](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#dsp-target), [duration-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#duration-ms), [listen-window](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#listen-window), [low-fr-operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#low-fr-operating-point), [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point), [samples-per-second](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#samples-per-second), [sv-threshold](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#sv-threshold)
**Available values:** [phrasespot](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#phrasespot)
**Also see these related items:** [live-spot.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-spot.md#live-spot-code), [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-eval-code), [PhraseSpot.java](https://doc.sensory.com/tnl/7.8/api/sample/android/enroll-trigger.md#et-code), [segmentSpottedAudio.java](https://doc.sensory.com/tnl/7.8/api/sample/java/segmentSpottedAudio.md#segmentspottedaudio-code)
*[API]: Application Programming Interface
*[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "reference/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/reference/"
---
# Reference
This section covers the TrulyNatural SDK product — SDK variants and
supported platforms, command-line tools, the supplied models and model
types, licensing, and the changelog. For the programming interfaces, see
the [API reference](https://doc.sensory.com/tnl/7.8/api/index.md#api-reference).
[Overview](https://doc.sensory.com/tnl/7.8/reference/overview.md#ref-overview)
- **Start here.** SDK variants, development host requirements, supported target platforms, models, tools, and license keys.
[Command-line tools](https://doc.sensory.com/tnl/7.8/tools/index.md#command-line-tools)
- Utilities for running and constructing models.
[Models](https://doc.sensory.com/tnl/7.8/models/index.md#models)
- Sample models included in this distribution.
[Model types](https://doc.sensory.com/tnl/7.8/models/types/index.md#model-types)
- Descriptions of various model types and their behaviors.
[Licenses](https://doc.sensory.com/tnl/7.8/licenses/index.md#sensory-sdk-license)
- Sensory and third-party legal agreements.
[Changelog](https://doc.sensory.com/tnl/7.8/changes/index.md#v7-changes)
- Changes by TrulyNatural SDK version.
[How to upgrade](https://doc.sensory.com/tnl/7.8/upgrade.md#how-to-upgrade)
- Change to a different SDK type or upgrade to a newer version.
[VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub)
- Guide to selecting the appropriate format for models created with
[Sensory's VoiceHub portal][vh].
[Contact information](https://doc.sensory.com/tnl/7.8/contact.md#contact)
- How to get in touch with Sensory.
[vh]: https://www.sensory.com/voicehub/ "Create a custom voice recognizer quickly and easily"
*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "reference/overview.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/reference/overview/"
---
# Overview
This section provides a brief overview of this SDK: the features supported by each [variant](https://doc.sensory.com/tnl/7.8/reference/overview.md#variants),
development host [requirements](https://doc.sensory.com/tnl/7.8/reference/overview.md#requirements), [supported target platforms](https://doc.sensory.com/tnl/7.8/reference/overview.md#supported-target-platforms), `snsr` [model](https://doc.sensory.com/tnl/7.8/reference/overview.md#ref-models) files,
command-line [tools](https://doc.sensory.com/tnl/7.8/reference/overview.md#ref-tools), and the software [license keys](https://doc.sensory.com/tnl/7.8/reference/overview.md#license-keys) used to control library features.
## Variants
The TrulyHandsfree, TrulyNatural (Lite), and TrulyNatural STT SDKs differ **only** in
the types of models they support. The APIs, model formats, tools, etc. are identical.
TrulyNatural STT is a strict superset of TrulyNatural (Lite), which in turn
is a strict superset of TrulyHandsfree.
**[TrulyNatural STT][tnl-stt]:**
* [x] Speech-To-Text with transformers and compressed language models.
* [x] Recognition hypotheses include punctuation and capitalization.
* [x] Machine-learned NLU for intent and entity identification.
* [x] Generative language models.
* [x] **Sensory has models available for 35 languages**,
each in multiple sizes (for best accuracy given a CPU cycle budget).
Contact your account representative or [Sensory Sales](https://doc.sensory.com/tnl/7.8/contact.md#sales) for
details.
* [x] _Includes [Open Source software](https://doc.sensory.com/tnl/7.8/licenses/oss.md#open-source-licenses)._
* [x] Features available only in TrulyNatural STT are flagged with _(STT only)_.
**[TrulyNatural Lite][tnl-lite]:**
* [x] Phonemic acoustic models with FST vocabulary decoding.
* [x] [Grammar-based](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#grammar-based-recognition) medium vocabulary command and control.
* [x] Grammar-based NLU for intent and entity identification.
* [x] Tools to build recognizers from grammars or phrase lists.
* [x] API to build or augment recognizers at runtime.
* [x] Runs [VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub) "large natural language vocabulary" models.
* [x] Support for devices with limited RAM (< 1 MiB) and CPU (< 500 MHz).
* [x] _No third-party or Open Source software._
* [x] Features available only in TrulyNatural (Lite and STT) are flagged with _(TrulyNatural only)_.
**[TrulyHandsfree][thf]:**
* [x] [Fixed](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type), [enrolled](https://doc.sensory.com/tnl/7.8/models/types/enroll.md#enroll-type) and [adapting](https://doc.sensory.com/tnl/7.8/models/types/ca.md#ca-type) wake words.
* [x] Command sets, which are keyword spotter recognizers for multiple (up to twenty) active phrases.
* [x] [VAD](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#vad).
* [x] [Command-line tools](https://doc.sensory.com/tnl/7.8/tools/index.md#command-line-tools) to enroll and evaluate wake word models, and to convert wake word models
into Sensory's [THF Micro][] DSP format.
* [x] Runs [VoiceHub](https://doc.sensory.com/tnl/7.8/reference/voicehub.md#voicehub) "wake word" and "simple commands" projects.
* [x] _No third-party or Open Source software._
## Requirements
For development, you'll need:
- A macOS, x86_64 Linux, or
Windows (version 10 or later, with [Microsoft Visual Studio][msvc] 2022)
development machine.
- iOS: Xcode 26.5 or later.
- Java: [Java JDK][jdk] 11 through 21.
- Android: [Android Studio Panda][as] 2025.3.4 or later.
[API level 21][api-levels] or later.
**Verified with:**
TrulyNatural SDK 7.9.0-pre.0 was verified against
**Xcode 26.5** and **Android Studio Panda 4 | 2025.3.4 Patch 1**.
Newer point releases are expected to work but are not part of the
release-test matrix.
Models require audio encoded as 16-bit LPCM and sampled at 16 kHz.
For optimal recognition accuracy, ensure that the dynamic range of the input audio
spans **at least** 12 bits (-24 [dBFS][] peak-to-peak, sample values
from -2048 to 2047) and that no clipping is present.
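The level guideline above can be checked on a buffer of captured samples before it is passed to a recognizer. The following is a minimal sketch and not part of the SDK: `AudioLevel` and `checkDynamicRange` are hypothetical names, the 12-bit test is interpreted as "peak magnitude of at least 2047", and any sample sitting at a 16-bit limit is assumed to indicate clipping.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type, not an SDK structure. */
typedef struct {
  int peak;         /* largest absolute sample value seen */
  int spans12Bits;  /* non-zero if the peak reaches the 12-bit range */
  int clipped;      /* non-zero if any sample sits at a 16-bit limit */
} AudioLevel;

/* Scan 16-bit LPCM samples and summarize the signal level. */
static AudioLevel
checkDynamicRange(const int16_t *pcm, size_t n)
{
  AudioLevel r = {0, 0, 0};
  for (size_t i = 0; i < n; i++) {
    int v = pcm[i];
    int a = v < 0 ? -v : v;
    if (a > r.peak) r.peak = a;
    /* Assumption: full-scale samples are treated as clipped. */
    if (v == INT16_MAX || v == INT16_MIN) r.clipped = 1;
  }
  /* 12 bits cover sample values from -2048 to 2047. */
  r.spans12Bits = r.peak >= 2047;
  return r;
}
```

A capture loop could call this once per recording and warn when `spans12Bits` is zero (input gain too low) or `clipped` is non-zero (input gain too high).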
## Supported target platforms
TrulyHandsfree and TrulyNatural run on hundreds of different operating systems
and CPU combinations. This distribution includes a subset of these in
_~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/lib/_. See the `README` files in the platform subdirectories
for additional details, such as the toolchain and compiler flags used to build the library.
TrulyNatural STT is available for Android, iOS on `arm64` and `arm64e`, macOS, Linux on `x86_64`, `aarch64`, and `arm`,
and Windows on `x86_64`.
[Contact](https://doc.sensory.com/tnl/7.8/contact.md#contact) Sensory if your target platform isn't listed.
Platform { data-sort-default }| STT support| Note
:-----------------------------|:----------:|:----
`aarch64-linux-gnu` | • yes | [GLIBC][] >= 2.33
`arm-linux-gnueabi` | • no | [GLIBC][] >= 2.17
`arm-linux-gnueabihf` | • yes | [GLIBC][] >= 2.33
`arm-none-eabi` | • no
`arm-none-eabihf` | • no
`arm-none-eabihf-ethosu` | • no
`armv6-linux-gnueabihf` | • no | [GLIBC][] >= 2.17
`i686-linux-gnu` | • no | [GLIBC][] >= 2.17
`ios` | • yes | 64-bit only
`android` | • yes | [API level][api-levels] >= 21
`macos` | • yes
`mipsel-buildroot-linux-uclibc` | • no
`mipsel-openwrt-linux-musl` | • no
`x86_64-linux-gnu` | • yes | [GLIBC][] >= 2.17
`x86_64-windows-msvc` | • yes | Requires [MSVC Runtime][] 2022
*Included target platform libraries*
## Models
TrulyNatural SDK `.snsr` files include all the models and settings required for a task, and a flow
graph that defines the behavior. A task can be as simple as a single-phrase [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type),
or something more complicated such as wake word followed by a VAD and an STT recognizer that transcribes
the detected speech segment. If you're just interested in the final recognition results, the code required
to run these two examples is identical.
This distribution includes sample [models](https://doc.sensory.com/tnl/7.8/models/index.md#models), along with [templates](https://doc.sensory.com/tnl/7.8/models/index.md#templates) that add further behaviors to them.
## Tools
The _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/bin/_ directory contains a number of [command-line tools](https://doc.sensory.com/tnl/7.8/tools/index.md#command-line-tools). These evaluate models,
compose new models, modify settings, enroll wake words, convert wake word models to [THF Micro][] DSP format,
and diagnose audio recording quality.
These utilities are compiled for the development host. You can [compile these from source](https://doc.sensory.com/tnl/7.8/api/sample/c/index.md#c-examples) for other
platforms.
## License keys
The TrulyNatural SDK installer embeds the license key entered on the
"Product Licensing" page in the libraries and tools it installs.
All applications that link against these libraries include this license key.
Keys include the SDK licensee name.
License keys control access to specific SDK features,
target platforms, and CPU architectures, and specify an expiration date
for access.
Model files also include license keys. These are validated upon loading.
License keys fall into two broad categories: _development_ keys, which
either expire at some future date or limit use, and
_production_ keys, which do not expire and have no usage limits.
You can use the [LICENSE](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_license) option
to apply an updated license key with the [configuration](https://doc.sensory.com/tnl/7.8/api/library-config.md#config) API
at runtime.
**Warning:**
Do _not_ use development / expiring keys in shipping products. These will
stop working when the keys expire.
[Contact](https://doc.sensory.com/tnl/7.8/contact.md#contact) Sensory to obtain production-ready libraries and models.
**Also see these related items:** [license-exp-date](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#license-exp-date), [license-exp-message](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#license-exp-message), [license-exp-warn](https://doc.sensory.com/tnl/7.8/api/setting-keys/library-information.md#license-exp-warn), [model-license-exp-date](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#model-license-exp-date), [model-license-exp-message](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#model-license-exp-message), [model-license-exp-warn](https://doc.sensory.com/tnl/7.8/api/setting-keys/runtime.md#model-license-exp-warn), [LICENSE](https://doc.sensory.com/tnl/7.8/api/library-config.md#config_license), [LICENSE_NOT_VALID](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [LICENSE_LIMIT_EXCEEDED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc), [LICENSE_OVERRIDE_NOT_SUPPORTED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_supported), [LICENSE_OVERRIDE_NOT_ENABLED](https://doc.sensory.com/tnl/7.8/api/inference.md#rc_license_override_not_enabled)
[api-levels]: https://en.wikipedia.org/wiki/Android_version_history "Android version history and API levels"
[as]: https://developer.android.com/studio/index.html "Android Studio"
[dBFS]: https://en.wikipedia.org/wiki/DBFS "Decibels relative to full scale"
[GLIBC]: https://sourceware.org/glibc/wiki/Glibc%20Timeline "GNU C Library Release Timeline"
[jdk]: https://adoptium.net "Java Development Kit"
[msvc]: https://visualstudio.com/ "Microsoft Visual Studio"
[MSVC Runtime]: https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170#visual-studio-2015-2017-2019-and-2022 "Microsoft Visual C++ Redistributable"
[thf]: https://www.sensory.com/wake-word/ "Low Power Wake Words & Phrase Recognition Engine"
[THF Micro]: https://doc.sensory.com/thf-micro/ "THF Micro documentation"
[tnl-lite]: https://www.sensory.com/natural-language-understanding/ "Large Vocabulary Continuous Speech Recognition (LVCSR) with Dynamic Natural Language Understanding"
[tnl-stt]: https://www.sensory.com/embedded-speech-to-text/ "Embedded Speech To Text"
*[API]: Application Programming Interface
*[FST]: Finite-State Transducer
*[LPCM]: Linear pulse-code modulation
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[NLU]: Natural Language Understanding model
*[OSS]: Open-source software
*[RAM]: Random Access Memory
*[SDK]: Software Development Kit
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[VAD]: Voice Activity Detector
---
source_path: "reference/voicehub.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/reference/voicehub/"
---
# VoiceHub
[Sensory's VoiceHub][vh] is a web portal that provides a convenient interface for
developers to prototype and experiment with wake words, language models and natural
language understanding. Users can build custom wake words, voice control command sets, and
create grammar-based language models with flexible intents and entities.
VoiceHub uses Sensory's [TrulyHandsfree][thf] for wake words and spotted commands,
and [TrulyNatural][tnl-lite] for grammar-based recognition with natural language
markup to identify intents and entities.
## Output format selection
VoiceHub can deliver recognizer models in various formats, as specified by the `Output Format`
selector. If you want to use such a model with the TrulyHandsfree or TrulyNatural SDKs
you should select the `THF/TNL SDK: snsr file` option. This is the default for new projects.
If you are using TrulyHandsfree or TrulyNatural on a small embedded platform,
you should select one of the alternate output formats described below.
**`THF/TNL SDK: snsr file`** _(recommended)_
- This is the standard TrulyNatural model format. Use this unless you will be running the model
on an embedded platform with limited CPU cycles and available RAM.
**`THF/TNL SDK: snsr file (low memory use)`**
- This optimizes the model for small platforms with limited CPU cycles (< 500 MHz)
and RAM (< 1 MiB of heap). The reduced heap and CPU requirements come at the expense
of a bit of recognition accuracy.
**`THF/TNL SDK: .c file (low memory use)`**
- Similar to `THF/TNL SDK: snsr file (low memory use)` above, but also includes a model converted
to C code that you can compile into your application. On platforms with read-only / flash
memory this reduces the amount of RAM required by the size of the model file.
- VoiceHub uses [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit) to create two C files from the `snsr` model:
- `snsr-edit -c voicehub -t model.snsr` to create _model.c_, and
- `snsr-edit -i -t model.snsr -o model-custom-init.c` to create _model-custom-init.c_. This file
includes custom initialization code that elides unused modules at link time to [reduce overall
application size](https://doc.sensory.com/tnl/7.8/faq.md#reduce-code-size).
**`THF/TNL SDK: .c file (low memory use for ST Micro STM32H7)`**
- Similar to `THF/TNL SDK: .c file (low memory use)`, but also includes TrulyNatural SDK libraries
for use on the STMicroelectronics [STM32H7][] series microcontrollers.
**`Embedded: Arm Cortex-M55/M85 Ethos-U55-128`**
- Similar to `THF/TNL SDK: .c file (low memory use)`, but with inference optimized for Arm's
Cortex-M55/M85 and [Ethos-U55][] NPU with [Vela][] accelerator config `ethos-u55-128`.
- Use this on the [Alif Ensemble][] family of microcontrollers.
**`Embedded: Arm Cortex-M55/M85 Ethos-U55-256`**
- Similar to `THF/TNL SDK: .c file (low memory use)`, but with inference optimized for Arm's
Cortex-M55/M85 and [Ethos-U55][] NPU with [Vela][] accelerator config `ethos-u55-256`.
- Use this on the [Alif Ensemble][] family of microcontrollers.
**`Embedded: Infineon Arm Cortex-M55/M85 Ethos-U55-128 (model in RAM)`**
- Similar to `THF/TNL SDK: .c file (low memory use)`, but with inference optimized for Arm's
Cortex-M55/M85 and [Ethos-U55][] NPU with [Vela][] accelerator config `ethos-u55-128`.
- The compiled model code runs from RAM. This requires more available heap than the
`(model in ROM/Flash)` option below. Use this only if the flash read speed is too low
to allow the model to run in real time.
- Use this only on Infineon microcontrollers.
**`Embedded: Infineon Arm Cortex-M55/M85 Ethos-U55-128 (model in ROM/Flash)`**
- Similar to `THF/TNL SDK: .c file (low memory use)`, but with inference optimized for Arm's
Cortex-M55/M85 and [Ethos-U55][] NPU with [Vela][] accelerator config `ethos-u55-128`.
- The compiled model code runs from code space.
- Use this only on Infineon microcontrollers.
[Alif Ensemble]: https://alifsemi.com/products/ensemble/ "The Alif Ensemble family of Arm-based 32-bit microcontrollers"
[Ethos-U55]: https://developer.arm.com/Processors/Ethos-U55 "Arm Ethos-U NPU family"
[STM32H7]: https://www.st.com/en/microcontrollers-microprocessors/stm32h7-series.html
[thf]: https://www.sensory.com/wake-word/ "Low Power Wake Words & Phrase Recognition Engine"
[tnl-lite]: https://www.sensory.com/natural-language-understanding/ "Large Vocabulary Continuous Speech Recognition (LVCSR) with Dynamic Natural Language Understanding"
[Vela]: https://developer.arm.com/documentation/109267/0102/Tool-support-for-the-Arm-Ethos-U-NPU/Ethos-U-Vela-compiler "Ethos-U Vela compiler"
[vh]: https://www.sensory.com/voicehub/ "Create a custom voice recognizer quickly and easily"
*[RAM]: Random Access Memory
*[ROM]: Read-Only Memory, typically nonvolatile flash memory
*[SDK]: Software Development Kit
*[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "tools/audio-check.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/tools/audio-check/"
---
# audio-check
This tool checks an audio file for problems such as all-zero runs or clipping.
It also estimates the signal-to-noise ratio.
The audio file should be a mono WAV file sampled at 16 kHz.
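To illustrate what an all-zero run check involves, here is a minimal sketch. It is not the implementation used by `audio-check`; the function name `longestZeroRunMs` is hypothetical, and the sketch assumes the samples have already been decoded from the WAV file into 16-bit values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: longest run of all-zero samples, in milliseconds. */
static double
longestZeroRunMs(const int16_t *pcm, size_t n, unsigned sampleRateHz)
{
  size_t run = 0, longest = 0;
  for (size_t i = 0; i < n; i++) {
    /* Extend the current run on a zero sample, otherwise restart it. */
    run = pcm[i] == 0 ? run + 1 : 0;
    if (run > longest) longest = run;
  }
  if (sampleRateHz == 0) return 0.0;
  return 1000.0 * (double)longest / (double)sampleRateHz;
}
```

At 16 kHz, sixteen consecutive zero samples correspond to a flat stretch of one millisecond.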
## Usage
```
Reports audio file quality.
usage: audio-check wavfile
options:
-v [-v [-v]] : increase verbosity
```
## Example
```console
% audio-check sampleAudio.wav
Clipping/Saturation:
No clipping / saturation - OK
Flat Waveform:
Problem: Flat for 1 msec.
Signal-to-Noise Ratio
Problem: low signal-to-noise ratio.
SNR estimate: 9.19 dBA (poor)
```
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "tools/index.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/tools/"
---
# Command-line tools
The TrulyNatural SDK includes a number of command-line utilities.
Find executables for the host platform in _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/bin/_.
## Tools
[snsr-eval](https://doc.sensory.com/tnl/7.8/tools/snsr-eval.md#snsr-eval)
- Evaluates / runs TrulyNatural SDK `.snsr` model files.
[snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit)
- Edits/modifies TrulyNatural SDK `.snsr` model files.
[spot-enroll](https://doc.sensory.com/tnl/7.8/tools/spot-enroll.md#spot-enroll)
- Enrolls TrulyNatural SDK wake words on audio files.
[live-enroll](https://doc.sensory.com/tnl/7.8/tools/live-enroll.md#live-enroll)
- Enrolls TrulyNatural SDK wake words on live audio.
[snsr-eval-batch](https://doc.sensory.com/tnl/7.8/tools/snsr-eval-batch.md#snsr-eval-batch)
- Runs a TrulyNatural SDK `.snsr` model file on test data
and reports the false accept rate, false reject ratio, optional
word-error rate, and execution speed.
[spot-convert](https://doc.sensory.com/tnl/7.8/tools/spot-convert.md#spot-convert)
- Converts TrulyNatural SDK wake word models to [THF Micro][] format.
[snsr-log-split](https://doc.sensory.com/tnl/7.8/tools/snsr-log-split.md#snsr-log-split)
- Splits spotter log files into an event log, captured audio data,
and the source spotter model.
[audio-check](https://doc.sensory.com/tnl/7.8/tools/audio-check.md#audio-check)
- Reports audio file quality.
[THF Micro]: https://doc.sensory.com/thf-micro/ "THF Micro documentation"
*[SDK]: Software Development Kit
*[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "tools/live-enroll.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/tools/live-enroll/"
---
# live-enroll
Interactive command-line phrase spotter enrollment, using the default audio
capture device.
**Also see these related items:** [live-enroll.c](https://doc.sensory.com/tnl/7.8/api/sample/c/live-enroll.md#live-enrollc)
## Usage
```
Enrolls TrulyNatural SDK wake words on live audio.
usage: live-enroll -t task [options] +user1 [+user2 ...] [file ...]
options:
-e enrollments : enrollment context output filename
-o out : enrolled model output filename (default: enrolled-sv.snsr)
-p prefix : capture each enrollment to file as
--{pass,fail}-.wav
-s setting=value : override a task setting
-t task : specify task filename (required)
-v [-v [-v]] : increase verbosity
Enrollment audio is captured from the default microphone, unless
the optional [file ...] arguments are supplied.
Settings are strings used as keys to query or change task behavior.
Most frequently used for enrollment is accuracy, which takes a value
between 0 and 1.
Refer to the TrulyNatural SDK documentation for a complete list and
descriptions of all supported settings.
```
## Examples
Enroll two phrases interactively on a Raspberry Pi 3.
```console
% cd sample/c
% make -s -j4 all
% bin/live-enroll -v -t ../../model/udt-universal-3.67.1.0.snsr \
+hey-sensory +hello-voice-genie
Say the enrollment phrase (1/4) for "hey-sensory"
Recording: 3.46 s
Preliminary enrollment checks passed.
Say the enrollment phrase (2/4) for "hey-sensory"
Recording: 3.41 s
Preliminary enrollment checks passed.
Say the enrollment phrase (3/4) for "hey-sensory" with context,
for example: " will it rain tomorrow?"
Recording: 4.30 s
This enrollment recording is not usable.
Reason: silence-begin
Fix: Please wait for the prompt before speaking.
Say the enrollment phrase (3/4) for "hey-sensory" with context,
for example: " will it rain tomorrow?"
Recording: 4.44 s
Preliminary enrollment checks passed.
Say the enrollment phrase (4/4) for "hey-sensory" with context,
for example: " will it rain tomorrow?"
Recording: 4.18 s
Preliminary enrollment checks passed.
Say the enrollment phrase (1/4) for "hello-voice-genie"
Recording: 2.30 s
Preliminary enrollment checks passed.
Say the enrollment phrase (2/4) for "hello-voice-genie"
Recording: 3.53 s
Preliminary enrollment checks passed.
Say the enrollment phrase (3/4) for "hello-voice-genie" with context,
for example: " will it rain tomorrow?"
Recording: 4.22 s
Preliminary enrollment checks passed.
Say the enrollment phrase (4/4) for "hello-voice-genie" with context,
for example: " will it rain tomorrow?"
Recording: 5.59 s
Preliminary enrollment checks passed.
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```
Test the enrolled model:
```console
% bin/snsr-eval -v -t ./enrolled-sv.snsr
Using live audio from default capture device. ^C to stop.
1485 2175 (0.70) hey-sensory
4155 5085 (0.68) hello-voice-genie
7710 8685 (0.61) hello-voice-genie
10770 11535 (0.61) hey-sensory
^C
```
*[API]: Application Programming Interface
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "tools/snsr-edit.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/tools/snsr-edit/"
---
# snsr-edit
This tool edits default task settings, and composes specialized
tasks by filling template task slots with spotter models.
**Also see these related items:** [snsr-edit.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-edit.md#snsr-editc)
## Usage
```
Edits/modifies TrulyNatural SDK .snsr model files.
usage: snsr-edit -t task [options]
options:
-C tag-identifier : emit C source to load model into RAM
-c tag-identifier : emit C source to run model from code space
-e setting filename : extract task setting/slot into filename
-f setting filename : load filename into task setting/slot
-g setting value : load string into task setting
-i : emit custom initialization code
-o out : output filename
-p : prune unused settings to reduce model size
-q setting : query a task setting
-s setting=value : override a task setting
-t task : specify task filename (required)
-v [-v [-v]] : increase verbosity
Settings are strings used as keys to query or change task behavior.
Most frequently used are operating-point for wake words and command sets,
leading-silence and trailing-silence for VAD templates,
partial-result-interval for LVCSR and STT, and stt-profile for STT models.
Refer to the TrulyNatural SDK documentation for a complete list and
descriptions of all supported settings.
```
## Examples
Query and change the default [operating-point](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#operating-point). This creates a new file, _hbg-3.snsr_,
which is a copy of _spot-hbg-enUS-1.4.0-m.snsr_ with a less accepting default operating point.
See the [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) task description for a list of valid setting names.
```console
% snsr-edit -t spot-hbg-enUS-1.4.0-m.snsr -q operating-point
operating-point = 10
% snsr-edit -t spot-hbg-enUS-1.4.0-m.snsr -s operating-point=3 -o hbg-3.snsr
% snsr-edit -t hbg-3.snsr -q operating-point
operating-point = 3
```
Create a new spotter task model that runs a fixed phrase spotter and an enrolled
spotter (user-defined or fixed-trigger) at the same time.
_tpl-spot-concurrent-1.5.0.snsr_ is a template with two slots,
named `0` and `1`. The combined model _fixed+udt.snsr_ is a standard
[wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type) task that spots the vocabulary from
_spot-hbg-enUS-1.4.0-m.snsr_ and _enrolled-sv-0.snsr_ at the same time.
```console
% snsr-edit -v -t tpl-spot-concurrent-1.5.0.snsr \
    -f 0 spot-hbg-enUS-1.4.0-m.snsr \
    -f 1 enrolled-sv-0.snsr -o fixed+udt.snsr
Saved edited model to "fixed+udt.snsr".
```
Convert a spotter model to C code.
```console
% snsr-edit -v -c voicegenie -t spot-voicegenie-enUS-6.5.1-m.snsr
Saved edited model to "spot-voicegenie-enUS-6.5.1-m.c".
% head -20 spot-voicegenie-enUS-6.5.1-m.c
```
Create custom TrulyNatural initialization code that limits code references to
those modules needed to run the _spot-hbg-enUS-1.4.0-m.snsr_ and
_spot-music-enUS-1.2.0-m.snsr_ models. Include
the generated _snsr-custom-init.c_ file in your application build, and
compile with `-DSNSR_USE_SUBSET`. This reduces the application code size
by restricting the application to running only the models used
to create the custom initialization. See [Compile-time macros](https://doc.sensory.com/tnl/7.8/api/compile-macros.md#compile-time-macros).
```console
% snsr-edit -v -i -t spot-hbg-enUS-1.4.0-m.snsr -t spot-music-enUS-1.2.0-m.snsr
Output written to "snsr-custom-init.c".
```
*[API]: Application Programming Interface
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
*[UDT]: User-Defined Trigger: enrolled wake words and command sets
---
source_path: "tools/snsr-eval-batch.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/tools/snsr-eval-batch/"
---
# snsr-eval-batch
This tool runs a [wake word](https://doc.sensory.com/tnl/7.8/models/types/wake-word.md#wake-word-type), [LVCSR](https://doc.sensory.com/tnl/7.8/models/types/lvcsr.md#lvcsr-type), or [STT](https://doc.sensory.com/tnl/7.8/models/types/stt.md#stt-type)
model over a (typically large) number of audio files to measure performance in
terms of the false accept (FA) rate and the false reject (FR) ratio. It can also
measure command substitution or word-error rate (WER) in LVCSR and STT.
## Test data requirements
Audio files used for FR (in-vocabulary) testing:
- Must contain a single target phrase utterance per file.
- Must contain lead-in ambient audio before the target phrase begins.
+ In most cases one second of ambient audio will suffice.
+ For custom spotters, refer to the documentation delivered with the
model for the exact requirements.
+ Most models created after May 2020 include a setting,
`min-in-vocab-duration`, which specifies the minimum required lead-in
time in milliseconds.
You can query this with `snsr-edit -t model.snsr -q min-in-vocab-duration`
+ Recognition events that happen during the required lead-in time are
counted as errors. See `INVFA` in the log file format table below.
+ You can override the minimum lead-in requirement on the command-line
(with `-s min-in-vocab-duration=0`), but doing
so means you will be testing the model outside of its intended
operating environment.
- The FR ratio is calculated as the fraction of the in-vocabulary files
that the spotter model did not find the phrase in, expressed as a
percentage. Example: Out of 2000 files, 120 did not trigger the spotter.
The false-reject ratio is therefore 6.0%.
- If reference phrase checking is used, mismatches are logged as
substitutions (SB code) and included in the FR count and ratio.
- If word-error rate is used, the total words, substitutions, additions,
and deletions in each phrase are logged. The totals for each
across the entire test set are also reported.
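
The FR ratio arithmetic described in the list above can be sketched as follows. This is an illustration only: `falseRejectRatio` is a hypothetical name, and folding substitutions into the numerator reflects the reference-phrase-checking case described above.

```c
/* Hypothetical helper: false-reject ratio as a percentage.
 * filesTotal:    number of in-vocabulary files tested
 * filesMissed:   files in which the phrase was not detected
 * substitutions: files whose result did not match the reference
 *                (pass 0 when reference phrase checking is not used) */
static double
falseRejectRatio(unsigned filesTotal, unsigned filesMissed,
                 unsigned substitutions)
{
  if (filesTotal == 0) return 0.0;  /* avoid division by zero */
  return 100.0 * (double)(filesMissed + substitutions) / (double)filesTotal;
}
```

With the figures from the example above, `falseRejectRatio(2000, 120, 0)` evaluates to 6.0.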
Audio files used for FA testing:
- Should be much longer than the in-vocabulary examples.
- Should contain a selection of noise expected to be encountered during
regular use.
- Must not contain explicit instances of the target phrase.
- The FA rate is calculated as the average number of times the spotter
model mistakenly triggered per hour.
Example: Out of 120 hours of audio, the spotter triggered 60 times.
The false-accept rate is 0.5 / hour.
- If you run `snsr-eval-batch` with the `-u` flag, unexpected recognition
events from the FR testing files are included in the false accept totals.
These unexpected events include:
+ Spots that happen during the required lead-in period.
+ The second and all subsequent spots, as each in-vocabulary file
must contain only a single target phrase utterance.
- FA testing applies to wake words only. Commands, LVCSR, and
STT are not continuous listening technologies, so FA testing is not
relevant to them.
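
The FA rate described in the list above is a count normalized by audio duration. A minimal sketch, with the hypothetical name `falseAcceptsPerHour` and the duration expressed in seconds:

```c
/* Hypothetical helper: false accepts per hour of out-of-vocabulary audio. */
static double
falseAcceptsPerHour(unsigned falseAccepts, double audioSeconds)
{
  if (audioSeconds <= 0.0) return 0.0;  /* no audio, no meaningful rate */
  return (double)falseAccepts * 3600.0 / audioSeconds;
}
```

For the example above, 60 triggers in 120 hours (432000 seconds) gives a rate of 0.5 per hour.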
## Usage
```
Runs a TrulyNatural SDK wake word model file on test data
and reports the false accept rate, false reject ratio, and execution speed.
usage: snsr-eval-batch -t task [options]
options:
-a : Add tpl-vad-lvcsr to LVCSR and STT models
-c filename : csv in-vocabulary (FR) and reference filename list
-f setting filename : load filename into task setting
-h : show this help and exit
-i filename : in-vocabulary (FR) filename list
-j threads : number of concurrent jobs (default: 1)
-l filename : log output file (default: .log)
-n : normalize results (lower case, strip punctuation)
-o filename : out-of-vocabulary (FA) filename list
-s setting=value : override a task setting
-t task : specify task filename (required)
-u : count in-vocabulary FAs
-v [-v [-v]] : increase verbosity
-w : calculate word-error rate on in-vocabulary audio
At least one of -i, -c, or -o is required.
-c and -i cannot be used together.
-c file format is two comma-separated filespecs ','
Settings are strings used as keys to query or change task behavior.
Most frequently used for wake words and command sets is operating-point.
Refer to the TrulyNatural SDK documentation for a complete list and
descriptions of all supported settings.
```
- The files specified by the `-i` and `-o` options must contain
exactly one audio file path per line, with no extraneous whitespace. The
line separator is the newline character, `\n`.
- Files specified by the `-c` option must be a comma-separated value (CSV)
file with exactly one audio file path and reference file path per line,
and no extraneous whitespace. Each line will have two comma-separated
fields. The first field is an audio file, and the second field is a text
file containing the reference (expected result). UTF-8 is supported.
- At least one of `-c`, `-i`, or `-o` must be specified.
- `-c` and `-i` cannot be used together.
- `-w` requires `-c`.
- `-u` counts unexpected phrase spots in the in-vocabulary (FR) data
towards the false accept total. This only has an effect for spotters
that require a significant lead-in time to stabilize. This flag can only
be used when testing wake words and commands, and cannot be used with `-w`.
- The `-j` option determines the number of concurrent threads to start.
For multi-core CPUs this can significantly speed up the evaluation.
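
When generating or validating a `-c` list programmatically, each line can be split into its two fields. The sketch below is an assumption-laden illustration rather than the tool's own parser: `splitCsvLine` is a hypothetical name, and it assumes that paths contain no commas and that no quoting is used.

```c
#include <string.h>

/* Hypothetical helper: split "audio,reference" in place.
 * Returns 0 on success, -1 if the line does not hold exactly two
 * non-empty fields. The line buffer is modified. */
static int
splitCsvLine(char *line, char **audio, char **reference)
{
  char *comma, *eol;
  /* Strip the trailing line terminator, if present. */
  if ((eol = strpbrk(line, "\r\n")) != NULL) *eol = '\0';
  comma = strchr(line, ',');
  if (comma == NULL || comma == line || comma[1] == '\0') return -1;
  if (strchr(comma + 1, ',') != NULL) return -1;  /* more than two fields */
  *comma = '\0';
  *audio = line;
  *reference = comma + 1;
  return 0;
}
```

Running every line of a list through a check like this before a long batch job is one way to catch malformed entries early.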
## Example
```console
% snsr-eval-batch -v -v -v -t alexa-fr.snsr \
-i inv.txt -o oov.txt -j 6 -s operating-point=10
Writing log to "alexa-fr.log"
INV: 2612 files, 23.128 hr, 23:07:39.285
OOV: 686 files, 142.984 hr, 142:59:01.345
Total: 3298 files, 166.111 hr, 166:06:40.630
Using operating point 10.
Available operating points: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20.
3298 files, 166.111 hr, 118 FA 0.83/hr, 3.33% FR, 2525 TA, 658.9x RT
```
- 3298 files processed.
- 166.111 hours of audio processed.
- 118 false accept spots, which is an FA rate of 0.83 per hour.
- 3.33% false reject ratio.
- 2525 true accept spots on in-vocabulary test audio.
- 658.9 real-time factor.
```console
% snsr-eval-batch -a -t model/stt-enUS-automotive-medium-2.3.15-pnc.snsr \
-c stt-16kHz-en-general-quicktest-full.csv -w -n -j 6
999 files, 1.281 hr, 9174 Words, 833 Substitutions, 198 Insertions, 120 Deletions, 12.546% WER, 5.3 xRT
```
- 999 files processed.
- 1.281 hours of audio processed.
- 9174 total words in test.
- 833 substitutions.
- 198 insertions.
- 120 deletions.
- 12.546% Word Error Rate.
- 5.3 real-time factor.
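
The reported word-error rate is consistent with the conventional definition: substitutions plus insertions plus deletions, divided by the number of reference words. The sketch below assumes that definition; `wordErrorRate` is a hypothetical name, not an SDK function.

```c
/* Hypothetical helper: word-error rate as a percentage,
 * using the conventional (S + I + D) / N definition. */
static double
wordErrorRate(unsigned words, unsigned substitutions,
              unsigned insertions, unsigned deletions)
{
  if (words == 0) return 0.0;  /* avoid division by zero */
  return 100.0 * (double)(substitutions + insertions + deletions)
               / (double)words;
}
```

For the run above, `wordErrorRate(9174, 833, 198, 120)` is 1151 / 9174, which rounds to the reported 12.546%.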
## Log file format
`snsr-eval-batch` produces a log file in plain text format. Each line in
this file follows the same pattern: `KEY [subkey] [detail]`.
KEY | subkey | detail | notes
-|-|-|-
CMDFR | "filename" | reference | false reject, no matches detected in this file
CMDSB | "filename" | start-ms end-ms "phrase" "reference" [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score) | mismatch between command phrase and reference
CMDTA | "filename" | start-ms end-ms "phrase" "reference" [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score) | true-accept with reference phrase checking
ERROR | error message | | unexpected error encountered
FR+SBCOUNT | double | | total number of false-reject + substitution errors
FR+SBRATIO | double | % | overall false-reject + substitution ratio
FACOUNT | integer | | total number of false-accept spots
FARATE | double | / hr | overall false-accept rate
FRCOUNT | double | | total number of false-reject errors
FRRATIO | double | % | overall false-reject ratio
INFO | start-time | YYYY-MM-DD HH:MM:SS.sss UTC | job start time in UTC
INFO | completion-time | YYYY-MM-DD HH:MM:SS.sss UTC | job end time in UTC
INFO | duration | double | total job duration in seconds
INFO | sdk-name | "TrulyHandsfree" or "TrulyNatural" |
INFO | sdk-version | version-string | snsr-eval-batch SDK version
INFO | command-line | command-line arguments | includes `argv[0]`
INFO | operating-point | integer | selected operating point
INFO | inv-files | integer | number of in-vocabulary (FR) test files
INFO | inv-seconds | integer | seconds of in-vocabulary audio
INFO | inv-hours | HHH:MM:SS.sss | inv-seconds as hours, minutes, seconds
INFO | oov-files | integer | number of out-of-vocabulary (FA) test files
INFO | oov-seconds |integer | seconds of out-of-vocabulary audio
INFO | oov-hours | HHH:MM:SS.sss | oov-seconds as hours, minutes, seconds
INFO | inv/oov-seconds |integer | seconds of OOV audio in FR test files
INFO | inv/oov-hours | HHH:MM:SS.sss | inv/oov-seconds as hours, minutes, seconds
INFO | rejected-files | integer | number of rejected files (not used in the test)
INFO | real-time-factor | double | total duration of audio processed divided by the processing time
INVFA | "filename" | start-ms end-ms "phrase" [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score) | FA in in-vocabulary test file. This is a spot that happened during the `min-in-vocab-duration` lead-in period, or an additional, spurious, spot recognized in the in-vocabulary file.
INVFR | "filename" | | false reject, no spot in this file
INVTA | "filename" | start-ms end-ms "phrase" [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score) | true accept
INVTX | "filename" | N spots | more than one spot in this file
OOVFA | "filename" | start-ms end-ms "phrase" [sv-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#sv-score) [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score) | FA in out-of-vocabulary test file
REJECT | "filename" | reason | filename was rejected as unusable
STTFR | "filename" | reference | false reject, no matches detected in this file
STTSB | "filename" | start-ms end-ms "phrase" "reference" word-count, substitutions, additions, deletions, word-error-rate | mismatch between LVCSR/STT phrase and reference
STTTA | "filename" | start-ms end-ms "phrase" "reference" word-count, substitutions, additions, deletions, word-error-rate | true-accept (no mismatch) between LVCSR/STT phrase and reference
TACOUNT | integer | | total number of true-accept spots
WER | double | % | overall word-error-rate
WER_DELETIONS | integer | | total number of WER deletions
WER_INSERTIONS | integer | | total number of WER insertions
WER_SUBSTITUTIONS | integer | | total number of WER substitutions
WER_WORDS | integer | | total number of WER words
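The summary counters above can be cross-checked by hand. The sketch below uses the conventional word-error-rate definition (substitutions plus insertions plus deletions, divided by the number of reference words) and the real-time-factor definition from the table. That the tool computes `WER` in exactly this way is an assumption, and the function names are illustrative only.

```python
def word_error_rate(substitutions, insertions, deletions, words):
    """Conventional WER as a percentage of reference words.

    Assumption: this mirrors how the WER row relates to the
    WER_SUBSTITUTIONS, WER_INSERTIONS, WER_DELETIONS and WER_WORDS rows.
    """
    if words <= 0:
        raise ValueError("reference word count must be positive")
    return 100.0 * (substitutions + insertions + deletions) / words


def real_time_factor(audio_seconds, processing_seconds):
    """Audio duration divided by processing time, as described for
    the real-time-factor row. Values above 1 mean faster than real time."""
    if processing_seconds <= 0:
        raise ValueError("processing time must be positive")
    return audio_seconds / processing_seconds


if __name__ == "__main__":
    print(word_error_rate(2, 1, 1, 40))
    print(real_time_factor(120.0, 30.0))
```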
*[API]: Application Programming Interface
*[FA]: False Accept: the recognizer triggered when the target phrase was not spoken
*[FR]: False Reject: the recognizer did not trigger when the target phrase was spoken
*[LVCSR]: Large Vocabulary Continuous Speech Recognition model, feed-forward neural net acoustic model with FST decoder
*[SDK]: Software Development Kit
*[STT]: Speech To Text: transformers with language model and CTC decoding
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "tools/snsr-eval.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/tools/snsr-eval/"
---
# snsr-eval
This tool evaluates / runs TrulyNatural SDK `snsr` model files.
It supports all [task types](https://doc.sensory.com/tnl/7.8/api/setting-keys/configuration.md#task-type), except wake word [enrollment](https://doc.sensory.com/tnl/7.8/api/setting-keys/values.md#enroll),
which is handled by [spot-enroll](https://doc.sensory.com/tnl/7.8/tools/spot-enroll.md#spot-enroll) and [live-enroll](https://doc.sensory.com/tnl/7.8/tools/live-enroll.md#live-enroll).
**Also see these related items:** [snsr-eval.c](https://doc.sensory.com/tnl/7.8/api/sample/c/snsr-eval.md#snsr-evalc)
## Usage
```
Evaluates/runs TrulyNatural SDK .snsr model files.
usage: snsr-eval -t task [options] [wavefile ...]
options:
-a : Add tpl-vad-lvcsr to LVCSR and STT models
-d directory : VAD audio output directory
-f setting filename : load filename into task setting
-g setting value : load string into task setting
-i listFile : run evaluation on each filename in listFile
-l [-l [-l]] : reduce verbosity
-o out : output filename for VAD audio or listFile results
-p [-p] : Enable pipeline profiling (experimental)
-q setting : query a task setting
-s setting=value : override a task setting
-t task : specify task filename (required)
-u filename : remove unused settings and save model to filename
-v [-v [-v]] : increase verbosity
Use a filename of - to read headerless linear 16-bit PCM little-endian
audio from stdin. If you don't specify any wave files, snsr-eval uses
live audio captured from the default audio device.
The -d and -o options are mutually exclusive. The output directory
must be writable. Audio files created by VAD segmentation are named
/.wav
Settings are strings used as keys to query or change task behavior.
Most frequently used are operating-point for wake words and command sets,
leading-silence and trailing-silence for VAD templates,
partial-result-interval for LVCSR and STT, and stt-profile for STT models.
Refer to the TrulyNatural SDK documentation for a complete list and
descriptions of all supported settings.
```
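The usage text says a filename of `-` makes `snsr-eval` read headerless 16-bit little-endian PCM from `stdin`. A minimal sketch of feeding it from Python follows. The helper names are hypothetical, and it assumes the WAV file already has the sample rate and channel count the model expects; the sketch checks only the sample width.

```python
import subprocess
import wave


def read_pcm16(wav_path):
    """Return the raw 16-bit PCM payload of a WAV file, without the header.

    WAV stores samples little-endian, so the bytes can be piped as-is.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        return wav.readframes(wav.getnframes())


def eval_from_stdin(task_path, wav_path):
    """Pipe headerless PCM into snsr-eval and return what it printed.

    Uses only the -t option and the '-' filename from the usage text.
    """
    done = subprocess.run(
        ["snsr-eval", "-t", task_path, "-"],
        input=read_pcm16(wav_path),
        capture_output=True,
        check=True,
    )
    return done.stdout.decode()
```

Calling `eval_from_stdin("./model/spot-hbg-enUS-1.4.0-m.snsr", "hbg_2.wav")` should then behave like passing the WAV file on the command line, provided the audio format matches what the model expects.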
## Batch processing
If you specify the `-i listFile` option, `snsr-eval` will evaluate the model on the
filenames in `listFile`. This loads the model once and re-uses the [session](https://doc.sensory.com/tnl/7.8/api/inference.md#session) instance
for each evaluation, reducing overhead. It expects one filename per line.
In batch processing mode, `snsr-eval` produces output in [tab-separated value][tsv] format.
Each audio file in `listFile` has a corresponding result line in the output, unless
processing the audio file results in an error. Such errors are treated as warnings and
printed to `stderr`. If you don't specify an output file with `-o`,
output goes to `stdout` instead.
Output columns are, in order:
* File index, starting at `1`
* Audio filename
* If there is a recognition [result](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#result):
    * Start alignment in ms, [begin-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#begin-ms)
    * End alignment in ms, [end-ms](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#end-ms)
    * Recognition score, [score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#score)
    * Result hypothesis, [text](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#text)
* If there is an [nlu-intent](https://doc.sensory.com/tnl/7.8/api/setting-keys/events.md#nlu-intent):
    * Intent name, [nlu-intent-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-name)
    * Intent score, [nlu-intent-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-score)
    * Intent value, [nlu-intent-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-intent-value)
* For each NLU entity found:
    * Entity name, [nlu-entity-name](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-name)
    * Entity score, [nlu-entity-score](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-score)
    * Entity value, [nlu-entity-value](https://doc.sensory.com/tnl/7.8/api/setting-keys/results.md#nlu-entity-value)
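The column list above can be read back with a short script. This is a sketch under stated assumptions, not a description of the exact file layout: it assumes the optional column groups appear in the listed order and are simply absent when there is no result, intent, or entity, and it keeps every value as a string. The function names and dictionary keys are illustrative only.

```python
import csv


def parse_batch_row(fields):
    """Split one tab-separated result line into labeled groups.

    Assumption: columns follow the documented order, and optional
    groups are omitted from the end of the line when not present.
    """
    row = {"index": int(fields[0]), "filename": fields[1],
           "result": None, "intent": None, "entities": []}
    rest = fields[2:]
    if len(rest) >= 4:  # begin-ms, end-ms, score, text
        row["result"] = dict(zip(("begin_ms", "end_ms", "score", "text"), rest[:4]))
        rest = rest[4:]
    if len(rest) >= 3:  # nlu-intent name, score, value
        row["intent"] = dict(zip(("name", "score", "value"), rest[:3]))
        rest = rest[3:]
    while len(rest) >= 3:  # one name, score, value triple per entity
        row["entities"].append(dict(zip(("name", "score", "value"), rest[:3])))
        rest = rest[3:]
    return row


def read_batch_results(path):
    """Read a file written with -o and parse every non-empty line."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [parse_batch_row(fields) for fields in reader if len(fields) >= 2]
```

Check the parsed rows against a few lines of real output before relying on the grouping, since the handling of missing results is a guess.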
## Examples
Fixed-phrase.
```console
% snsr-eval -t ./model/spot-hbg-enUS-1.4.0-m.snsr hbg_2.wav hbg_7.wav
1200 1905 hello blue genie
3855 4575 hello blue genie
```
Fixed-phrase on default audio capture device.
```console
% snsr-eval -v -t ./model/spot-hbg-enUS-1.4.0-m.snsr
Using live audio from default capture device. ^C to stop.
3180 3885 (1.00 sv) hello blue genie
9000 9720 (1.00 sv) hello blue genie
^C
```
Enrolled user-defined phrase.
```console
% snsr-eval -v -t ./three-users.snsr -s sv-threshold=0 \
    ./data/enrollments/armadillo-1-4-c.wav ./data/enrollments/armadillo-6-0.wav \
    ./data/enrollments/terminator-2-5.wav ./data/enrollments/jackalope-1-4-c.wav
435 990 (0.89 sv) armadillo-1
5940 6630 (0.99 sv) terminator-2
8100 8610 (0.32 sv) jackalope-1
```
Lower the speaker-verification threshold.
```console
% snsr-eval -v -t ./three-users.snsr -s sv-threshold=0 \
    ./data/enrollments/jackalope-1-5.wav ./data/enrollments/jackalope-1-5-c.wav
270 840 (0.56 sv) jackalope-1
2130 2610 (0.33 sv) jackalope-1
```
Recognize a list of audio files.
```console
% find data -name \*.wav > audio-files.txt
% snsr-eval -t model/stt-enUS-automotive-medium-2.3.15-pnc.snsr -o eval.tsv -i audio-files.txt
Processing file 50 of 50, 100.00%
```
[tsv]: https://en.wikipedia.org/wiki/Tab-separated_values "Tab-separated values"
*[API]: Application Programming Interface
*[NLU]: Natural Language Understanding model
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "tools/snsr-log-split.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/tools/snsr-log-split/"
---
# snsr-log-split
Command-line splitter for logfiles generated by the debug template
[tpl-spot-debug](https://doc.sensory.com/tnl/7.8/models/index.md#tpl-spot-debug).
## Usage
```
Splits spotter log files into an event log, captured audio data,
and the source spotter model.
usage: snsr-log-split logfile [logfile ...]
options:
-d directory : output directory (default: .)
-v [-v [-v]] : increase verbosity
```
For each log file, `snsr-log-split` writes an event log file (_.txt_), an audio file (_.wav_),
and a model file (_.snsr_) into the current directory, or into the output
directory if you specified one with `-d`.
## Example
```console
% snsr-log-split -d out debug.log
% ls out
debug.snsr debug.txt debug.wav
```
This writes the components of _debug.log_ to the subdirectory _out/_,
which must already exist.
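Because the output directory has to exist before the tool runs, a wrapper script can create it first. The sketch below also predicts the output paths, assuming they reuse the log file's base name as in the example above (_debug.log_ producing _debug.snsr_, _debug.txt_, and _debug.wav_). The function names are illustrative, not part of the SDK.

```python
import subprocess
from pathlib import Path


def expected_outputs(logfile, out_dir="."):
    """Paths the split is expected to produce, keyed by extension.

    Assumption: outputs are named after the log file's base name.
    """
    stem = Path(logfile).stem
    return {ext: Path(out_dir) / f"{stem}.{ext}" for ext in ("txt", "wav", "snsr")}


def split_log(logfile, out_dir="."):
    """Create the output directory if needed, then run snsr-log-split."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(["snsr-log-split", "-d", str(out_dir), str(logfile)], check=True)
    return expected_outputs(logfile, out_dir)
```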
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "tools/spot-convert.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/tools/spot-convert/"
---
# spot-convert
Command-line phrase spotter model conversion tool, targeting Sensory's
deeply embedded DSP solutions running [THF Micro][].
**Also see these related items:** [spot-convert.c](https://doc.sensory.com/tnl/7.8/api/sample/c/spot-convert.md#spot-convertc), [THF Micro][]
## Usage
```
Converts TrulyNatural SDK wake word models to THF Micro format.
usage: spot-convert -t task [options] target
options:
-a : convert all operating-points
-c : create .c output (in addition to .bin)
-o output : full prefix for output filenames
-p output-prefix : prefix for output filenames (default: task-target-)
-q slotname : model slot prefix
-s setting=value : override a task setting
-t task : set a task filename (required)
-v [-v [-v]] : increase verbosity
Output filenames are determined by the model parameters:
$(prefix) [-] [slot$(slotname)-] $(target)- $(version)-
op$(operating-point)- {dev,prod}- {net,search}.{bin,c,h}
where:
prefix specified by the -p option, or taken from the filename
of the task if -p isn't used.
version is the oldest DSP library that can run this model.
-dev- models are limited in runtime or number of recognition
events and should not be used in products.
-prod- models are not limited and ready for production use.
Use the -o option to override this filename pattern to:
$(prefix)[-]{net,search}.{bin,c,h}
The -o and -a options are mutually exclusive.
Output filenames are constrained to never start with "-"
Settings are strings used as keys to query or change task behavior.
Most frequently used for wake words and command sets is operating-point.
Refer to the TrulyNatural SDK documentation for a complete list and
descriptions of all supported settings.
```
## Examples
Convert a phrase spotter model to `pc38` format. This converts only the default
operating point for the model, point 10:
```console
% spot-convert -v -t model/spot-hbg-enUS-1.4.0-m.snsr pc38
operating-point: 10
production-ready: yes
wrote acoustic model (bin) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op10-prod-net.bin"
wrote search model (bin) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op10-prod-search.bin"
wrote search header to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op10-prod-search.h"
```
Convert a phrase spotter model to `pc38` format, overriding base filename:
```console
% spot-convert -v -v -t model/spot-hbg-enUS-1.4.0-m.snsr -p embedded pc38
target: pc38
basename: embedded-pc38-
operating-point: 10
production-ready: yes
wrote acoustic model (bin) to "embedded-pc38-3.4.0-op10-prod-net.bin"
wrote search model (bin) to "embedded-pc38-3.4.0-op10-prod-search.bin"
wrote search header to "embedded-pc38-3.4.0-op10-prod-search.h"
```
Convert a phrase spotter model to `pc38` format, overriding filename metadata:
```console
% spot-convert -v -v -t model/spot-hbg-enUS-1.4.0-m.snsr -o embedded pc38
target: pc38
basename: embedded-
operating-point: 10
production-ready: yes
wrote acoustic model (bin) to "embedded-net.bin"
wrote search model (bin) to "embedded-search.bin"
wrote search header to "embedded-search.h"
```
Convert all operating points, produce C code in addition to the binaries:
```console
% spot-convert -v -a -c -t model/spot-hbg-enUS-1.4.0-m.snsr pc38
operating-point: 1
production-ready: yes
wrote acoustic model (bin) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op01-prod-net.bin"
wrote search model (bin) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op01-prod-search.bin"
wrote search header to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op01-prod-search.h"
wrote acoustic model (C) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op01-prod-net.c"
wrote search model (C) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op01-prod-search.c"
operating-point: 2
production-ready: yes
wrote acoustic model (bin) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op02-prod-net.bin"
wrote search model (bin) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op02-prod-search.bin"
wrote search header to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op02-prod-search.h"
wrote acoustic model (C) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op02-prod-net.c"
wrote search model (C) to "spot-hbg-enUS-1.4.0-m-pc38-3.4.0-op02-prod-search.c"
... etc ...
```
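Build scripts often need to know the output filenames in advance. The sketch below composes a name from the pattern in the usage text and reproduces the filenames in the examples above. It is an illustration under assumptions: the function name is hypothetical, the two-digit zero padding of the operating point is inferred from the `op01` and `op10` examples, and the slot handling follows the `slot$(slotname)-` pattern without having been checked against real output.

```python
def output_filename(prefix, target, version, operating_point,
                    production, part, ext, slotname=None):
    """Compose a spot-convert output filename from the documented pattern.

    part is "net" or "search"; ext is "bin", "c", or "h".
    """
    pieces = []
    if prefix:
        # Avoid a doubled hyphen if the prefix already ends with one.
        pieces.append(prefix.rstrip("-"))
    if slotname:
        pieces.append(f"slot{slotname}")
    pieces += [
        target,
        version,
        f"op{operating_point:02d}",
        "prod" if production else "dev",
        part,
    ]
    return "-".join(pieces) + "." + ext


if __name__ == "__main__":
    print(output_filename("spot-hbg-enUS-1.4.0-m", "pc38", "3.4.0",
                          10, True, "net", "bin"))
```

The `version` field is the oldest DSP library that can run the model, so it is only known after a first conversion; the examples above report `3.4.0`.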
[THF Micro]: https://doc.sensory.com/thf-micro/ "THF Micro documentation"
*[API]: Application Programming Interface
*[THF]: TrulyHandsfree, Sensory's wake word and command recognition technology
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "tools/spot-enroll.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/tools/spot-enroll/"
---
# spot-enroll
Command-line wake word enrollment.
**Also see these related items:** [spot-enroll.c](https://doc.sensory.com/tnl/7.8/api/sample/c/spot-enroll.md#spot-enrollc)
## Usage
```
Enrolls TrulyNatural SDK wake words on audio files.
usage: spot-enroll -t task [options] [+user1 file1 [-c] file2 ...] [+user2 ...]
options:
-a adaptedfile : adapted enrollment context output filename
-c file : recording contains trailing context
-e enrolledfile : enrollment context output filename
-o out : enrolled model output filename (default: enrolled-sv.snsr)
-s setting=value : override a task setting
-t task : specify task filename (required)
-v [-v [-v]] : increase verbosity
Settings are strings used as keys to query or change task behavior.
Most frequently used for enrollment is accuracy, which takes a value
between 0 and 1.
Refer to the TrulyNatural SDK documentation for a complete list and
descriptions of all supported settings.
```
## Examples
Enroll two users.
```console
% spot-enroll -t ./model/udt-universal-3.67.1.0.snsr \
    +armadillo-1 \
      ./data/enrollments/armadillo-1-0.wav \
      ./data/enrollments/armadillo-1-1.wav \
      ./data/enrollments/armadillo-1-2.wav \
      ./data/enrollments/armadillo-1-3.wav \
    +jackalope-1 \
      ./data/enrollments/jackalope-1-0.wav \
      ./data/enrollments/jackalope-1-1.wav \
      ./data/enrollments/jackalope-1-2.wav \
      ./data/enrollments/jackalope-1-3.wav
```
Enroll a single user, with two enrollment recordings that
include trailing context:
```console
% spot-enroll -v -t ./model/udt-universal-3.67.1.0.snsr \
    +armadillo-1 \
      ./data/enrollments/armadillo-1-0.wav \
      -c ./data/enrollments/armadillo-1-0-c.wav \
      ./data/enrollments/armadillo-1-1.wav \
      -c ./data/enrollments/armadillo-1-1-c.wav
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```
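When enrollments are driven from a script, the `+user` and `-c` arguments can be assembled programmatically. The sketch below builds an argument list using only options from the usage text (`-t`, `-o`, `-c`, and `+user`). The function name and the shape of the `users` mapping are illustrative choices, not part of the SDK.

```python
import subprocess


def build_enroll_command(task, users, output=None):
    """Build a spot-enroll argument list.

    users maps a user name to a list of (wav_path, has_trailing_context)
    tuples; recordings with trailing context are preceded by -c.
    """
    argv = ["spot-enroll", "-t", task]
    if output:
        argv += ["-o", output]
    for name, recordings in users.items():
        argv.append(f"+{name}")
        for path, has_context in recordings:
            if has_context:
                argv.append("-c")
            argv.append(path)
    return argv


if __name__ == "__main__":
    command = build_enroll_command(
        "./model/udt-universal-3.67.1.0.snsr",
        {"armadillo-1": [
            ("./data/enrollments/armadillo-1-0.wav", False),
            ("./data/enrollments/armadillo-1-0-c.wav", True),
        ]},
    )
    print(" ".join(command))
    # To run the enrollment: subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting and line-continuation issues altogether.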
Enroll a user phrase, save the enrollment context to file.
```console
% spot-enroll -v -t ./model/udt-universal-3.67.1.0.snsr \
    -e armadillo-1-enrollments.snsr \
    +armadillo-1 \
      ./data/enrollments/armadillo-1-0.wav \
      ./data/enrollments/armadillo-1-1.wav \
      ./data/enrollments/armadillo-1-2.wav \
      ./data/enrollments/armadillo-1-3.wav
Enrollment context saved to "armadillo-1-enrollments.snsr"
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```
Enroll another user phrase, save the enrollment context to file.
```console
% spot-enroll -v -v -t ./model/udt-universal-3.67.1.0.snsr \
    -e jackalope-1-enrollments.snsr \
    +jackalope-1 \
      ./data/enrollments/jackalope-1-0.wav \
      ./data/enrollments/jackalope-1-1.wav \
      ./data/enrollments/jackalope-1-2.wav \
      ./data/enrollments/jackalope-1-3.wav
Enrolling user "jackalope-1" from file "./data/enrollments/jackalope-1-0.wav".
Enrolling user "jackalope-1" from file "./data/enrollments/jackalope-1-1.wav".
Enrolling user "jackalope-1" from file "./data/enrollments/jackalope-1-2.wav".
Enrolling user "jackalope-1" from file "./data/enrollments/jackalope-1-3.wav".
jackalope-1: 4 enrollments.
Enrollment context saved to "jackalope-1-enrollments.snsr"
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```
Combine two previously saved enrollment contexts with a third enrollment,
save the combined enrollment context. Speed up the adaptation by reducing
the accuracy.
```console
% spot-enroll -v -v -v -t ./model/udt-universal-3.67.1.0.snsr \
    -t armadillo-1-enrollments.snsr \
    -t jackalope-1-enrollments.snsr \
    -e combined-enrollments.snsr -s accuracy=0.1 \
    +terminator-2 \
      ./data/enrollments/terminator-2-0.wav \
      ./data/enrollments/terminator-2-1.wav \
      ./data/enrollments/terminator-2-2.wav \
      ./data/enrollments/terminator-2-3.wav
Enrolling user "terminator-2" from file "./data/enrollments/terminator-2-0.wav".
Enrolling user "terminator-2" from file "./data/enrollments/terminator-2-1.wav".
Enrolling user "terminator-2" from file "./data/enrollments/terminator-2-2.wav".
Enrolling user "terminator-2" from file "./data/enrollments/terminator-2-3.wav".
armadillo-1: 4 enrollments.
jackalope-1: 4 enrollments.
terminator-2: 4 enrollments.
Enrollment context saved to "combined-enrollments.snsr"
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```
Re-enroll with full accuracy, save to _three-users.snsr_.
```console
% spot-enroll -v -v -v -t ./model/udt-universal-3.67.1.0.snsr \
    -t combined-enrollments.snsr -o three-users.snsr
armadillo-1: 4 enrollments.
jackalope-1: 4 enrollments.
terminator-2: 4 enrollments.
Adapting: 100% complete.
Enrolled model saved to "three-users.snsr"
```
Delete an enrollment from a saved enrollment context.
```console
% spot-enroll -v -v -t ./model/udt-universal-3.67.1.0.snsr \
    -t combined-enrollments.snsr -e two-enrollments.snsr \
    -s delete-user=jackalope-1
armadillo-1: 4 enrollments.
terminator-2: 4 enrollments.
Enrollment context saved to "two-enrollments.snsr"
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```
Enroll two users separately, save adapted contexts, then combine the saved
contexts without re-adapting.
```console
% spot-enroll -v -v -t ./model/udt-universal-3.67.1.0.snsr \
    -a armadillo-1-adapted.snsr \
    +armadillo-1 \
      ./data/enrollments/armadillo-1-0.wav \
      ./data/enrollments/armadillo-1-1.wav \
      -c ./data/enrollments/armadillo-1-0-c.wav \
      -c ./data/enrollments/armadillo-1-1-c.wav
Enrolling user "armadillo-1" from file "./data/enrollments/armadillo-1-0.wav".
Enrolling user "armadillo-1" from file "./data/enrollments/armadillo-1-1.wav".
Enrolling user "armadillo-1" with context from file "./data/enrollments/armadillo-1-0-c.wav".
Enrolling user "armadillo-1" with context from file "./data/enrollments/armadillo-1-1-c.wav".
armadillo-1: 4 enrollments.
Enrollment context saved to "armadillo-1-adapted.snsr"
Adapting: 100% complete.
Adapted enrollment context saved to "armadillo-1-adapted.snsr"
Enrolled model saved to "enrolled-sv.snsr"
% spot-enroll -v -t ./model/udt-universal-3.67.1.0.snsr \
    -a jackalope-1-adapted.snsr \
    +jackalope-1 \
      ./data/enrollments/jackalope-1-0.wav \
      ./data/enrollments/jackalope-1-1.wav \
      -c ./data/enrollments/jackalope-1-0-c.wav \
      -c ./data/enrollments/jackalope-1-1-c.wav
Enrollment context saved to "jackalope-1-adapted.snsr"
Adapting: 100% complete.
Adapted enrollment context saved to "jackalope-1-adapted.snsr"
Enrolled model saved to "enrolled-sv.snsr"
% spot-enroll -v -t ./model/udt-universal-3.67.1.0.snsr \
    -t ./armadillo-1-adapted.snsr \
    -t ./jackalope-1-adapted.snsr
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```
Load adapted contexts and force a re-adapt.
```console
% spot-enroll -v -t ./model/udt-universal-3.67.1.0.snsr \
    -t ./armadillo-1-adapted.snsr \
    -t ./jackalope-1-adapted.snsr \
    -s re-adapt=1
Adapting: 100% complete.
Enrolled model saved to "enrolled-sv.snsr"
```
*[API]: Application Programming Interface
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology
---
source_path: "upgrade.md"
canonical_url: "https://doc.sensory.com/tnl/7.8/upgrade/"
---
# How to upgrade
The TrulyHandsfree and TrulyNatural SDKs are fully backwards-compatible with application
code and models from earlier releases.
To upgrade to a new version or variant (for example from TrulyHandsfree to TrulyNatural),
replace the library files and rebuild your application.
## C applications
Use these steps for all applications that link against `libsnsr.a`.
This includes those that use other application languages via an adapter, such as [iOS](https://doc.sensory.com/tnl/7.8/api/build-system.md#build-ios).
1. Replace both `libsnsr.a` and `snsr.h` with their updated versions.
    - Find `libsnsr.a` appropriate for your target platform in the _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/lib/_ directory, and
      `snsr.h` in _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/include/_.
    - [new](https://doc.sensory.com/tnl/7.8/api/inference.md#new) does a runtime check to verify that the library and its header are in sync;
      if they're not, it will return the [LIBRARY_HEADER](https://doc.sensory.com/tnl/7.8/api/inference.md#rc) error code.
    - For iOS, update the [XCFramework][] from _~/Sensory/TrulyNaturalSDK/7.9.0-pre.0/lib/ios/snsr.xcframework_.
2. If you are using [custom library initialization](https://doc.sensory.com/tnl/7.8/faq.md#reduce-code-size), you must recreate
   `snsr-custom-init.c` using [snsr-edit](https://doc.sensory.com/tnl/7.8/tools/snsr-edit.md#snsr-edit) from the new SDK version.
3. Rebuild your application.
## Android and Java applications
If you're using the recommended [Android build recipe](https://doc.sensory.com/tnl/7.8/api/build-system.md#build-android):
1. Edit `gradle.properties` and update `SNSR_REPOSITORY`, `SNSR_LIB_VERSION`,
   and possibly `SNSR_LIB_TYPE`.
    - The SDK installers publish the library artifacts in `mavenLocal()` too.
      If you're using these, there's no need to update `SNSR_REPOSITORY`.
2. Rebuild your application.
[XCFramework]: https://developer.apple.com/documentation/xcode/creating-a-multi-platform-binary-framework-bundle
*[API]: Application Programming Interface
*[SDK]: Software Development Kit
*[TNL]: TrulyNatural, Sensory's large-vocabulary speech recognition technology