Microsoft HoloLens Academy Tutorial: Holograms 212 Voice [Microsoft has since updated this tutorial; this article covers the old version]

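This article covers the old version of the tutorial. To save your time, please read the original directly and treat this article as a reference only. Original: https://developer.microsoft.com/EN-US/WINDOWS/HOLOGRAPHIC/holograms_212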

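Voice input is another way to interact with holograms. Voice commands feel natural and easy in practice. When designing them, make sure they are:
- Natural
- Easy to remember
- Consistent with the context
- Sufficiently distinct from other options within the same context
In Holograms 101, we used keyword recognition to build two simple voice commands. This tutorial goes deeper into voice input and covers how to:
- Design voice commands that are optimized for the HoloLens speech engine.
- Make the user aware of which voice commands are available.
- Acknowledge that HoloLens has heard the user's voice command.
- Use a Dictation Recognizer to transcribe what the user says.
- Use a Grammar Recognizer to listen for commands based on an SRGS (Speech Recognition Grammar Specification) file.
This tutorial builds on the project completed in the previous two tutorials, Holograms 210 and Holograms 211.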

Project files:

Download the files required by this project.

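Unity setup
- Start Unity.
- Select Open.
- Navigate to the HolographicAcademy-Holograms-212-Voice folder you just downloaded.
- Find and select the Starting\ModelExplorer folder.
- Click the Select Folder button.
- Once Unity has opened the project, select the Scenes folder in the Project panel.
- Double-click the ModelExplorer scene to load it.
- At this point you can build the project and deploy it to your HoloLens. See the previous tutorials for the deployment steps.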
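Chapter 1: Awareness

Several factors go into designing voice commands. Here are some recommendations.

Do:
- Create concise commands. You don't want to use "Play the currently selected video", because that command is not concise and users would easily forget it. Use "Play Video" instead, because it is concise and has multiple syllables.
- Use simple vocabulary. Always try to use common words and phrases that are easy for the user to discover and remember. For example, if your application has a text object that can be shown or hidden, don't use the command "Show Placard", because "placard" is a rarely used word. Use "Show Note" to reveal the text object instead.
- Be consistent. Voice commands should be consistent across your application. Imagine an application with two scenes that both contain a button for closing the application. If the first scene triggers the button with "Exit" but the second scene uses "Close App", the user will be confused. If the same functionality persists across multiple scenes, the same voice command should trigger it.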
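Don't:
- Use single-syllable commands. For example, if you create a voice command to play a video, avoid the simple command "Play": it is only a single syllable and the system could easily miss it. Use "Play Video" instead, because it is concise and has multiple syllables.
- Use system commands. The "Select" command is reserved by the system to trigger a Tap event on the currently focused object. Do not reuse "Select" in a keyword or phrase, as it might not work as you expect. For example, if the voice command for selecting a cube in your application is "Select cube" but the user is looking at a sphere when saying it, the system selects the sphere instead. The app bar commands are voice enabled as well, so do not use the following voice commands in your application's main view:
  - Go Back
  - Scroll Tool
  - Zoom Tool
  - Drag Tool
  - Adjust
  - Remove
- Use similar-sounding commands. Avoid voice commands that rhyme. If a shopping application supports both "Show Store" and "Show More", disable one of them while the other is in use. For example, use "Show Store" to open the store, then disable that command once the store is displayed so that "Show More" can be used for browsing.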
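Some of the rules above can be checked mechanically before a command ever reaches the speech engine. The following is a minimal sketch of such a check, not part of the tutorial project: the class name `VoiceCommandLint` and the method `Check` are hypothetical, the reserved phrases are the ones listed above, and word count is used only as a rough stand-in for syllable count.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the tutorial project) that flags
// voice commands which break the guidelines listed above.
public static class VoiceCommandLint
{
    // Phrases the guidelines above list as reserved by the system.
    private static readonly HashSet<string> Reserved =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "Select", "Go Back", "Scroll Tool", "Zoom Tool",
            "Drag Tool", "Adjust", "Remove"
        };

    // Returns a list of warnings; an empty list means no rule was tripped.
    public static List<string> Check(string command)
    {
        var warnings = new List<string>();
        string trimmed = command.Trim();
        string[] words = trimmed.Split(
            new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        if (Reserved.Contains(trimmed))
        {
            warnings.Add("matches a reserved system command");
        }

        foreach (string word in words)
        {
            if (word.Equals("Select", StringComparison.OrdinalIgnoreCase))
            {
                warnings.Add("contains the reserved word \"Select\"");
                break;
            }
        }

        // Assumption: word count is only a rough proxy for syllable count.
        if (words.Length < 2)
        {
            warnings.Add("single word; likely too short to be recognized reliably");
        }

        return warnings;
    }
}
```

With this sketch, `VoiceCommandLint.Check("Play Video")` returns an empty list, while `"Play"` and `"Select cube"` each produce one warning. Rhyming commands such as "Show Store" and "Show More" are not detected; that rule still needs a human ear.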
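Steps:
- In Unity's Hierarchy panel, type holoComm_screen_mesh into the search box at the top.
- Double-click the holoComm_screen_mesh object to view it in the Scene. This is the astronaut's watch, which serves as the indicator panel for the voice interface.
- In the Inspector panel, locate the Keyword Manager (Script) component.
- Expand the Keywords and Responses section to see the currently supported voice command: Open Communicator.
- Double-click KeywordManager.cs to open it in Visual Studio.
- Read through the code to understand how it uses KeywordRecognizer to add voice commands and responds to them using delegates.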
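The pattern that KeywordManager.cs relies on can be sketched in a few lines. This is a simplified illustration rather than the actual KeywordManager.cs: the class name `SimpleKeywordListener` and the logged message are assumptions, while `KeywordRecognizer`, `OnPhraseRecognized`, and `PhraseRecognizedEventArgs` come from Unity's `UnityEngine.Windows.Speech` namespace.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using UnityEngine;
using UnityEngine.Windows.Speech;

// Simplified sketch of keyword recognition; the real KeywordManager.cs
// exposes its keywords and responses through the Inspector instead.
public class SimpleKeywordListener : MonoBehaviour
{
    private KeywordRecognizer keywordRecognizer;

    // Maps each spoken phrase to the delegate that handles it.
    private readonly Dictionary<string, Action> responses =
        new Dictionary<string, Action>();

    void Start()
    {
        // Placeholder response: a real handler would open the communicator.
        responses.Add("Open Communicator",
            () => Debug.Log("Open Communicator heard"));

        keywordRecognizer = new KeywordRecognizer(responses.Keys.ToArray());
        keywordRecognizer.OnPhraseRecognized += OnPhraseRecognized;
        keywordRecognizer.Start();
    }

    private void OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        // args.text holds the keyword that was recognized.
        Action response;
        if (responses.TryGetValue(args.text, out response))
        {
            response();
        }
    }

    void OnDestroy()
    {
        if (keywordRecognizer != null)
        {
            keywordRecognizer.OnPhraseRecognized -= OnPhraseRecognized;
            keywordRecognizer.Dispose();
        }
    }
}
```

Keeping the phrases and their handlers in one dictionary means that adding a command is a single `responses.Add` call, and the recognizer's keyword list can never drift out of sync with the handlers.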
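Deploy:

See the previous two tutorials for the deployment details.
- Once the project is deployed to the HoloLens, dismiss the fit box with an air-tap gesture.
- Turn your head so that your gaze rests on the astronaut's watch.
- When the watch has focus, the cursor changes to a microphone. This indicates that voice interaction is available here.
- A text tooltip appears above the watch. It tells the user which command to use, in this case "Open Communicator".
- While gazing at the watch, say "Open Communicator" to open the communicator panel.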
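Chapter 2: Acknowledgement

In this chapter the application records the user's voice and plays an animation while recording, as feedback that it is listening.

Note:

The Microphone capability must be enabled in the Unity project settings for an application to record from the microphone. It is already enabled in the Holograms 212 project, but don't forget it in your own projects.
1. In the Unity Editor, open "Edit > Project Settings > Player".
2. In the Inspector panel, click the "Windows Store" tab.
3. Expand "Publishing Settings > Capabilities" and check Microphone.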
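The same capability can also be switched on from an editor script, which helps when the setting has to be reproducible across machines. This is a minimal sketch under a few assumptions: the class name and menu path are hypothetical, and the file has to live in a folder named Editor. `PlayerSettings.WSA.SetCapability` is the Unity editor API behind the checkbox described above.

```csharp
using UnityEditor;
using UnityEngine;

// Hypothetical editor utility; place this file in an "Editor" folder.
public static class MicrophoneCapabilityMenu
{
    // Adds a menu entry that ticks the Microphone capability
    // under Publishing Settings > Capabilities.
    [MenuItem("Tools/Enable Microphone Capability")]
    public static void EnableMicrophone()
    {
        PlayerSettings.WSA.SetCapability(
            PlayerSettings.WSACapability.Microphone, true);
        Debug.Log("Microphone capability enabled for Windows Store builds.");
    }
}
```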
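Steps:
- In Unity's Hierarchy panel, find the holoComm_screen_mesh object and select it.
- In the Inspector panel on the right, find the Astronaut Watch (Script) component.
- Click the small blue cube that is set as the value of the Communicator Prefab property.
- The Communicator prefab should now have focus in the Project panel.
- Click the Communicator prefab in the Project panel and look at its components in the Inspector panel.
- Look at the Microphone Manager (Script) component, which allows the application to record the user's voice.
- Notice that the Communicator prefab also has a Keyword Manager (Script) component, which responds to the "Send Message" voice command.
- Look at the Communicator (Script) component and double-click it to open it in Visual Studio.
- Communicator.cs is responsible for setting the proper button states on the communicator panel. This lets users record a message, play it back, and send it to the astronaut. It also starts and stops an animated waveform so that users can tell whether their voice was heard.
- In the Start method of Communicator.cs, delete the following lines so that the buttons on the communicator panel are enabled:
    // TODO: 2.a Delete the following two lines:
    RecordButton.SetActive(false);
    MessageUIRenderer.gameObject.SetActive(false);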

Communicator.cs
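Build and test:
- Gaze at the astronaut's watch and say "Open Communicator" to open the communicator panel.
- Press the Record button (microphone) to start recording a verbal message for the astronaut.
- Start speaking. The waveform animation plays, which indicates that the system hears your voice.
- Press the Stop button (left square) and notice that the waveform animation stops.
- Press the Play button (right triangle) to play back the message you just recorded.
- Press the Stop button (right square) to stop the playback.
- Say "Send Message" to close the communicator and receive a 'Message Received' response from the astronaut.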
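Chapter 3: Understanding and the Dictation Recognizer

This chapter uses the Dictation Recognizer to convert the user's speech to text and display it on the communicator panel.

Keep the following in mind when using the Dictation Recognizer:

1. The HoloLens must be connected to WiFi for the Dictation Recognizer to work.
2. Timeouts occur after a set period of time. There are two timeouts to be aware of:
   - If the recognizer starts and hears no audio during the first five seconds, it times out.
   - If the recognizer hears twenty seconds of silence during dictation, it times out.
3. Only one type of recognizer (Keyword Recognizer or Dictation Recognizer) can run at a time.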
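The considerations above map onto a handful of Unity API calls. The sketch below is an illustration under stated assumptions, not the tutorial's MicrophoneManager.cs: the class name `DictationSession` and its method names are hypothetical. It sets the two timeouts explicitly to the values quoted above, and it shuts the phrase recognition system down before dictation starts because the two recognizer types cannot run together.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical wrapper that switches between keyword and dictation recognition.
public class DictationSession : MonoBehaviour
{
    private DictationRecognizer dictation;

    void Awake()
    {
        dictation = new DictationRecognizer();

        // The two timeouts described above, set explicitly for clarity.
        dictation.InitialSilenceTimeoutSeconds = 5f;
        dictation.AutoSilenceTimeoutSeconds = 20f;

        dictation.DictationResult += (text, confidence) =>
            Debug.Log("Result: " + text + " (" + confidence + ")");
        dictation.DictationComplete += cause =>
            Debug.Log("Dictation completed: " + cause);
    }

    public void Begin()
    {
        // Keyword recognizers must be stopped before dictation can start.
        if (PhraseRecognitionSystem.Status == SpeechSystemStatus.Running)
        {
            PhraseRecognitionSystem.Shutdown();
        }
        dictation.Start();
    }

    public void End()
    {
        if (dictation.Status == SpeechSystemStatus.Running)
        {
            dictation.Stop();
        }
        StartCoroutine(RestartKeywordRecognition());
    }

    private IEnumerator RestartKeywordRecognition()
    {
        // Wait until dictation has fully stopped, then bring keyword
        // recognition back (assumed to restore the stopped recognizers).
        while (dictation.Status == SpeechSystemStatus.Running)
        {
            yield return null;
        }
        PhraseRecognitionSystem.Restart();
    }

    void OnDestroy()
    {
        dictation.Dispose();
    }
}
```

Waiting in a coroutine rather than restarting immediately avoids asking the phrase recognition system to start while dictation is still shutting down.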

This feature also requires the Microphone capability to be enabled in the Unity project settings. Don't forget it in your own projects.

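Steps:

Next, edit MicrophoneManager.cs to accomplish the following:
1. When the Record button is pressed, start the Dictation Recognizer.
2. Show the hypothesis of what the Dictation Recognizer understood.
3. Lock in the results of what the Dictation Recognizer understood.
4. Check for timeouts from the Dictation Recognizer.
5. When the Stop button is pressed or the microphone session times out, stop the Dictation Recognizer.
6. Restart the KeywordRecognizer so that the user can send the message with the Send Message command.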
using HoloToolkit;
using System.Collections;
using System.Text;
using UnityEngine;
using UnityEngine.UI;
using UnityEngine.Windows.Speech;

public class MicrophoneManager : MonoBehaviour
{
    [Tooltip("A text area for the recognizer to display the recognized strings.")]
    public Text DictationDisplay;

    private DictationRecognizer dictationRecognizer;

    // Use this string to cache the text currently displayed in the text box.
    private StringBuilder textSoFar;

    // Using an empty string specifies the default microphone.
    private static string deviceName = string.Empty;
    private int samplingRate;
    private const int messageLength = 10;

    // Use this to reset the UI once the Microphone is done recording after it was started.
    private bool hasRecordingStarted;

    void Awake()
    {
        /* TODO: DEVELOPER CODING EXERCISE 3.a */

        // 3.a: Create a new DictationRecognizer and assign it to dictationRecognizer variable.
        dictationRecognizer = new DictationRecognizer();

        // 3.a: Register for dictationRecognizer.DictationHypothesis and implement DictationHypothesis below
        // This event is fired while the user is talking. As the recognizer listens, it provides text of what it's heard so far.
        dictationRecognizer.DictationHypothesis += DictationRecognizer_DictationHypothesis;

        // 3.a: Register for dictationRecognizer.DictationResult and implement DictationResult below
        // This event is fired after the user pauses, typically at the end of a sentence. The full recognized string is returned here.
        dictationRecognizer.DictationResult += DictationRecognizer_DictationResult;

        // 3.a: Register for dictationRecognizer.DictationComplete and implement DictationComplete below
        // This event is fired when the recognizer stops, whether from Stop() being called, a timeout occurring, or some other error.
        dictationRecognizer.DictationComplete += DictationRecognizer_DictationComplete;

        // 3.a: Register for dictationRecognizer.DictationError and implement DictationError below
        // This event is fired when an error occurs.
        dictationRecognizer.DictationError += DictationRecognizer_DictationError;

        // Query the maximum frequency of the default microphone. Use 'unused' to ignore the minimum frequency.
        int unused;
        Microphone.GetDeviceCaps(deviceName, out unused, out samplingRate);

        // Use this string to cache the text currently displayed in the text box.
        textSoFar = new StringBuilder();

        // Use this to reset the UI once the Microphone is done recording after it was started.
        hasRecordingStarted = false;
    }

    void Update()
    {
        // 3.a: Add condition to check if dictationRecognizer.Status is Running
        if (hasRecordingStarted && !Microphone.IsRecording(deviceName) && dictationRecognizer.Status == SpeechSystemStatus.Running)
        {
            // Reset the flag now that we're cleaning up the UI.
            hasRecordingStarted = false;

            // This acts like pressing the Stop button and sends the message to the Communicator.
            // If the microphone stops as a result of timing out, make sure to manually stop the dictation recognizer.
            // Look at the StopRecording function.
            SendMessage("RecordStop");
        }
    }

    /// <summary>
    /// Turns on the dictation recognizer and begins recording audio from the default microphone.
    /// </summary>
    /// <returns>The audio clip recorded from the microphone.</returns>
    public AudioClip StartRecording()
    {
        // 3.a Shutdown the PhraseRecognitionSystem. This controls the KeywordRecognizers
        PhraseRecognitionSystem.Shutdown();

        // 3.a: Start dictationRecognizer
        dictationRecognizer.Start();

        // 3.a Uncomment this line
        DictationDisplay.text = "Dictation is starting. It may take time to display your text the first time, but begin speaking now...";

        // Set the flag that we've started recording.
        hasRecordingStarted = true;

        // Start recording from the microphone for 10 seconds.
        return Microphone.Start(deviceName, false, messageLength, samplingRate);
    }

    /// <summary>
    /// Ends the recording session.
    /// </summary>
    public void StopRecording()
    {
        // 3.a: Check if dictationRecognizer.Status is Running and stop it if so
        if (dictationRecognizer.Status == SpeechSystemStatus.Running)
        {
            dictationRecognizer.Stop();
        }

        Microphone.End(deviceName);
    }

    /// <summary>
    /// This event is fired while the user is talking. As the recognizer listens, it provides text of what it's heard so far.
    /// </summary>
    /// <param name="text">The currently hypothesized recognition.</param>
    private void DictationRecognizer_DictationHypothesis(string text)
    {
        // 3.a: Set DictationDisplay text to be textSoFar and new hypothesized text
        // We don't want to append to textSoFar yet, because the hypothesis may have changed on the next event
        DictationDisplay.text = textSoFar.ToString() + " " + text + "...";
    }

    /// <summary>
    /// This event is fired after the user pauses, typically at the end of a sentence. The full recognized string is returned here.
    /// </summary>
    /// <param name="text">The text that was heard by the recognizer.</param>
    /// <param name="confidence">A representation of how confident (rejected, low, medium, high) the recognizer is of this recognition.</param>
    private void DictationRecognizer_DictationResult(string text, ConfidenceLevel confidence)
    {
        // 3.a: Append textSoFar with latest text
        textSoFar.Append(text + ". ");

        // 3.a: Set DictationDisplay text to be textSoFar
        DictationDisplay.text = textSoFar.ToString();
    }

    /// <summary>
    /// This event is fired when the recognizer stops, whether from Stop() being called, a timeout occurring, or some other error.
    /// Typically, this will simply return "Complete". In this case, we check to see if the recognizer timed out.
    /// </summary>
    /// <param name="cause">An enumerated reason for the session completing.</param>
    private void DictationRecognizer_DictationComplete(DictationCompletionCause cause)
    {
        // If Timeout occurs, the user has been silent for too long.
        // With dictation, the default timeout after a recognition is 20 seconds.
        // The default timeout with initial silence is 5 seconds.
        if (cause == DictationCompletionCause.TimeoutExceeded)
        {
            Microphone.End(deviceName);

            DictationDisplay.text = "Dictation has timed out. Please press the record button again.";
            SendMessage("ResetAfterTimeout");
        }
    }

    /// <summary>
    /// This event is fired when an error occurs.
    /// </summary>
    /// <param name="error">The string representation of the error reason.</param>
    /// <param name="hresult">The int representation of the hresult.</param>
    private void DictationRecognizer_DictationError(string error, int hresult)
    {
        // 3.a: Set DictationDisplay text to be the error string
        DictationDisplay.text = error + "\nHRESULT: " + hresult;
    }

    private IEnumerator RestartSpeechSystem(KeywordManager keywordToStart)
    {
        while (dictationRecognizer != null && dictationRecognizer.Status == SpeechSystemStatus.Running)
        {
            yield return null;
        }

        keywordToStart.StartKeywordRecognizer();
    }
}

MicrophoneManager
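Build and test:
- Gaze at the astronaut's watch and say "Open Communicator".
- Select the Record button (microphone) to record your message.
- Start speaking. The Dictation Recognizer interprets your speech and shows the hypothesized text on the communicator panel.
- Try saying "Send Message" while recording. Notice that the Keyword Recognizer does not respond, because the Dictation Recognizer is still active.
- Stop speaking for a few seconds. Watch the Dictation Recognizer complete its hypothesis and show the final result.
- Start speaking, then pause for 20 seconds. This causes the Dictation Recognizer to time out.
- Notice that the Keyword Recognizer is re-enabled after the timeout. The communicator now responds to voice commands again.
- Say "Send Message" to send the message to the astronaut.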
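Chapter 4: Grammar Recognizer

This chapter uses the Grammar Recognizer to recognize the user's speech according to an SRGS (Speech Recognition Grammar Specification) file.

This feature also requires the Microphone capability to be enabled in the Unity project settings. Don't forget it in your own projects.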

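Steps:
1. In Unity's Hierarchy panel, search for the Jetpack_Center object and select it.
2. Look at the Interactible Action script in the Inspector panel.
3. Click the small circle to the right of the Object To Tag Along property.
4. In the window that pops up, type SRGSToolbox into the search box and select it from the list.
5. With the SRGSToolbox prefab selected, look at the SRGSColor.xml file in the StreamingAssets folder in the Project panel.
- The SRGS design specification can be found on the W3C website.
- Our SRGS file has three rules:
  - A color rule, which lets you say any one of the twelve colors in a list.
  - A combination rule, which listens for a color combined with a shape.
  - The root rule (the color chooser), which listens for any combination of "color + shape" pairs. The shapes can be said in any order, and a single utterance can cover anywhere from one to all three of them. This is the only rule that is listened for, because it is specified as the root rule in the initial <grammar> tag at the top of the file.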
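To make the three-rule structure concrete, here is a heavily reduced SRGS grammar in the same spirit. It is an illustrative sketch and not the contents of SRGSColor.xml: the rule ids (`colorChooser`, `colorShape`, `colorList`) are assumptions, only three colors are listed instead of twelve, and "triangle" as the third shape is a guess. The `root` attribute on `<grammar>` is what designates the single rule that gets listened for.

```xml
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-US" root="colorChooser"
         xmlns="http://www.w3.org/2001/06/grammar">

  <!-- Root rule: between one and three "color + shape" pairs. -->
  <rule id="colorChooser">
    <item repeat="1-3">
      <ruleref uri="#colorShape"/>
    </item>
  </rule>

  <!-- Combination rule: a color followed by one shape. -->
  <rule id="colorShape">
    <ruleref uri="#colorList"/>
    <one-of>
      <item>circle</item>
      <item>square</item>
      <item>triangle</item>
    </one-of>
  </rule>

  <!-- Color rule: any one color from the list (shortened here). -->
  <rule id="colorList">
    <one-of>
      <item>red</item>
      <item>blue</item>
      <item>yellow</item>
    </one-of>
  </rule>
</grammar>
```

A phrase such as "blue circle yellow square" matches the root rule twice through the repeat, while a lone "blue" matches nothing because `colorList` is never exposed on its own.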
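Build and test:

1. Rebuild the application in Unity, then build and deploy from Visual Studio to try it on the HoloLens.
2. Gaze at the astronaut's jetpack and perform an air-tap gesture.
3. Start speaking. The Grammar Recognizer interprets your speech and changes the colors of the shapes based on what it recognized. An example command is "blue circle, yellow square".
4. Perform another air-tap gesture to dismiss the toolbox.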
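On the script side, loading an SRGS file and reacting to a match takes only a few calls. The following is a minimal sketch rather than the project's own script: the class name `GrammarListener` is hypothetical, and it only logs what was heard instead of recoloring shapes. `GrammarRecognizer`, `OnPhraseRecognized`, and `semanticMeanings` are part of `UnityEngine.Windows.Speech`.

```csharp
using System.IO;
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical listener that loads an SRGS grammar from StreamingAssets.
public class GrammarListener : MonoBehaviour
{
    [Tooltip("Grammar file name, relative to the StreamingAssets folder.")]
    public string GrammarFileName = "SRGSColor.xml";

    private GrammarRecognizer grammarRecognizer;

    void Start()
    {
        string path = Path.Combine(Application.streamingAssetsPath, GrammarFileName);
        grammarRecognizer = new GrammarRecognizer(path);
        grammarRecognizer.OnPhraseRecognized += OnPhraseRecognized;
        grammarRecognizer.Start();
    }

    private void OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        Debug.Log("Heard: " + args.text + " (confidence: " + args.confidence + ")");

        // Semantic meanings are only present when the grammar defines
        // semantic tags; otherwise the array may be null or empty.
        SemanticMeaning[] meanings = args.semanticMeanings;
        if (meanings == null)
        {
            return;
        }

        foreach (SemanticMeaning meaning in meanings)
        {
            Debug.Log(meaning.key + " = " + string.Join(", ", meaning.values));
        }
    }

    void OnDestroy()
    {
        if (grammarRecognizer != null)
        {
            grammarRecognizer.OnPhraseRecognized -= OnPhraseRecognized;
            grammarRecognizer.Dispose();
        }
    }
}
```

Like the Keyword Recognizer, the Grammar Recognizer belongs to the phrase recognition system, so it cannot run while a Dictation Recognizer is active.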

Original article: https://developer.microsoft.com/en-us/windows/holographic/holograms_212

If you find any misunderstandings or errors in the translation, please point them out. Thank you!

Source: https://www.cnblogs.com/qichun/p/6059867.html
