  • How to Build an Online Read-Aloud Feature: iFlytek Speech Synthesis in Practice

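    -- It has been a long time since I wrote a technical blog post. On a whim, I'm picking it back up.

    Motivation

    I use the Xuexi Qiangguo (学习强国) app every day, and its read-aloud feature is quite good. It turns out to be powered by iFlytek (科大讯飞), so I started writing code. https://www.xfyun.cn/doc/tts/online_tts/API.html

    Register as a developer first. I won't repeat the API requirements here, since the documentation covers them clearly. Turning them into a working implementation is, of course, another matter.

    I listened to the results. The free voices and the premium ("featured") voices differ a lot: the free ones are just about tolerable, while the premium ones are close to a real person. Pricing has two parts: a fee for the API itself and a fee for the premium voices. As usual when I start coding, everything begins with the free tier.

      Details here: https://www.xfyun.cn/services/online_tts

     Getting started

    I looked around and found no C# demo, which was awkward. There is documentation, but as everyone knows (think of the WeChat Official Account development docs), turning documentation into real code and a visible application takes considerable effort. After some searching I was delighted to find a recently published open-source project, and got to work: https://github.com/zuiyuewentian/XunFeiNETSDK

    This iFlytek API is based on WebSocket, so let's start with a console demo. C# ships with its own WebSocket support, but this project uses WebSocketSharp, which I like. System.Net.WebSockets.WebSocket is built around asynchronous methods (more on that later), whereas WebSocketSharp.WebSocket is event-based, which matches front-end programming habits well.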

    // Register the event handlers first, then open the connection.
    websocket = new WebSocketSharp.WebSocket(reqUrl);
    websocket.OnMessage += Websocket_OnMessage;
    websocket.OnOpen += Websocket_OnOpen;
    websocket.Connect();

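    After the iFlytek server receives our text, it sends the audio back as a stream. On our side we only need to collect that stream and write it to a file.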

    using System;
    using System.Diagnostics;
    using System.Linq;
    using NAudio.Wave;
    // XunFeiTTS and TTS_Data_Model come from the XunFeiNETSDK project.

    class Program
    {
        private static Stopwatch stopwatch;

        public static void Main(string[] args)
        {
            stopwatch = Stopwatch.StartNew();
            var xunFeiNetSdk = new XunFeiTTS();
            xunFeiNetSdk.MessageUpdate_Event += XunFeiNetSdk_MessageUpdate_Event;
            // The text to synthesize. In English: "Zhangjiajie Hehua International Airport,
            // Beijing Daxing Airport, Changsha Huanghua Airport, Shaoyang Wugang Airport:
            // all flights have resumed!"
            xunFeiNetSdk.SendData("张家界荷花国际机场,北京大兴机场,长沙黄花机场,邵阳武冈机场,所有航班全部复航!");
            Console.Read(); // keep the process alive while the audio arrives
        }

        // Audio bytes received so far.
        static byte[] data = new byte[0];

        private static void XunFeiNetSdk_MessageUpdate_Event(TTS_Data_Model message, string error = null)
        {
            if (error != null)
            {
                Console.WriteLine(error);
                return;
            }

            try
            {
                // Every message carries a chunk of audio; append it.
                data = data.Concat(message.audioStream).ToArray();

                // status == 2 marks the end of synthesis.
                if (message.status == 2)
                {
                    Console.WriteLine("Synthesis succeeded");
                    string voice = string.Format("{0}.wav", DateTime.Now.ToString("yyyyMMddHHmmssfff"));
                    Console.WriteLine("Saving..." + voice);

                    // 16 kHz, mono. The using block closes and disposes the writer.
                    using (var wavWriter = new WaveFileWriter(voice, new WaveFormat(16000, 1)))
                    {
                        wavWriter.Write(data, 0, data.Length);
                    }
                    Console.WriteLine("Saved.");
                    Console.WriteLine("Elapsed: " + stopwatch.Elapsed);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
        }
    }

    The file is written with NAudio. I extracted the relevant code from XunFeiNETSDK into my own project.
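
    The handler does two things: it accumulates audio chunks until status 2 arrives, then wraps the raw PCM in a WAV container. As a rough illustration of what that involves without NAudio, here is a minimal Node.js sketch. It assumes 16 kHz, mono, 16-bit PCM (matching WaveFormat(16000, 1)); the names wavHeader and PcmCollector are hypothetical and are not part of the SDK or of NAudio.

```javascript
// Sketch only: collect streamed PCM chunks and wrap them in a WAV container.
// Assumes 16 kHz, mono, 16-bit PCM, as in WaveFormat(16000, 1).

// Build the 44-byte RIFF/WAVE header for `dataLength` bytes of PCM.
function wavHeader(dataLength, sampleRate = 16000, channels = 1, bitsPerSample = 16) {
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const header = Buffer.alloc(44);
  header.write("RIFF", 0, "ascii");
  header.writeUInt32LE(36 + dataLength, 4);          // file size minus the first 8 bytes
  header.write("WAVE", 8, "ascii");
  header.write("fmt ", 12, "ascii");
  header.writeUInt32LE(16, 16);                      // size of the fmt chunk
  header.writeUInt16LE(1, 20);                       // format tag 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(sampleRate * blockAlign, 28); // bytes per second
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write("data", 36, "ascii");
  header.writeUInt32LE(dataLength, 40);
  return header;
}

// Hypothetical collector: store every chunk, concatenate once at the end.
class PcmCollector {
  constructor() {
    this.chunks = [];
  }
  push(chunk) {
    this.chunks.push(chunk);
  }
  toWav() {
    const pcm = Buffer.concat(this.chunks);
    return Buffer.concat([wavHeader(pcm.length), pcm]);
  }
}
```

    Each incoming chunk would be passed to push, and toWav would be called once the final message arrives. Keeping the chunks in a list and concatenating once avoids copying the whole buffer on every message.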

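    (Flights have been scarce for the last two months and my pay has dropped sharply, so forgive the test sentence for saying what's on my mind.)

    That gives us the audio. It sounds acceptable. But how do we get it into a web page?

    Turning it into a web application

    The basic idea: the front end sends the text, the back end hands it to the SDK to fetch the audio, and once the file is saved its URL is returned to the front end. Because sending the text and receiving the result happen at different times, the most suitable approach is for the front end to use WebSocket as well. That in turn means the back end needs a WebSocket service.

     

    I didn't want to run a separate WebSocket server, so the WebSocket endpoint can be exposed through a Web API controller, as follows: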

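    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Net;
    using System.Net.Http;
    using System.Net.WebSockets;
    using System.Text;
    using System.Threading;
    using System.Threading.Tasks;
    using System.Web;
    using System.Web.Hosting;
    using System.Web.Http;
    using System.Web.WebSockets;
    using NAudio.Wave;

    namespace HHOA.MVC5.Controllers.API
    {
        [RoutePrefix("api/msg")]
        public class MsgApiController : ApiController
        {
            private static List<WebSocket> _sockets = new List<WebSocket>();
            private readonly XunFeiTTS _xunFei;
            private WebSocket currentSocket = null;

            public MsgApiController()
            {
                _xunFei = new XunFeiTTS();
                _xunFei.MessageUpdate_Event += XunFeiNetSdk_MessageUpdate_Event;
                Logger.Info("Starting XunFeiTTS");
            }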
    
    
            // Audio bytes received so far for the current request.
            private byte[] data = new byte[0];

            private void XunFeiNetSdk_MessageUpdate_Event(TTS_Data_Model message, string error = null)
            {
                if (error != null)
                {
                    Logger.Info(error);
                    return;
                }

                try
                {
                    // Every message carries a chunk of audio; append it.
                    data = data.Concat(message.audioStream).ToArray();

                    // status == 2 marks the end of synthesis; until then keep collecting.
                    if (message.status != 2) return;

                    Logger.Info("Synthesis succeeded");
                    var savePath = HostingEnvironment.MapPath("~/Files/Voice/");
                    string voice = string.Format("{0}.wav", DateTime.Now.ToString("yyyyMMddHHmmssfff"));
                    var filePath = Path.Combine(savePath, voice);
                    var webPath = "/Files/Voice/" + voice;

                    Directory.CreateDirectory(savePath); // does nothing if it already exists

                    Logger.Info("Saving..." + filePath);
                    // 16 kHz, mono. The using block disposes the writer even if Write throws.
                    using (var wavWriter = new WaveFileWriter(filePath, new WaveFormat(16000, 1)))
                    {
                        wavWriter.Write(data, 0, data.Length);
                    }
                    data = new byte[0]; // reset so the next request on this socket starts clean
                    Logger.Info("Saved.");

                    // Send the audio URL to the front end (not awaited: this handler is synchronous).
                    if (currentSocket != null && currentSocket.State == WebSocketState.Open)
                    {
                        var payload = Encoding.UTF8.GetBytes("voice:" + webPath);
                        var sendBuffer = new ArraySegment<byte>(payload);
                        currentSocket.SendAsync(sendBuffer, WebSocketMessageType.Text, true, CancellationToken.None);
                    }
                }
                catch (Exception ex)
                {
                    Logger.Debug(ex.Message);
                }
            }
    
    
    
    
            [Route]
            [HttpGet]
            public HttpResponseMessage Connect()
            {
                // Accept the WebSocket request on the server. ProcessRequest is called once
                // the socket is established and handles all sending and receiving.
                HttpContext.Current.AcceptWebSocketRequest(ProcessRequest);

                // Build the response that agrees to switch protocols to WebSocket.
                return Request.CreateResponse(HttpStatusCode.SwitchingProtocols);
            }

            public async Task ProcessRequest(AspNetWebSocketContext context)
            {
                var socket = context.WebSocket; // the current WebSocket comes from the context
                _sockets.Add(socket);           // keep it in the static list

                // Loop until the client closes the socket.
                while (true)
                {
                    // Note: a message longer than 1024 bytes arrives in several fragments,
                    // which this demo does not reassemble.
                    var buffer = new ArraySegment<byte>(new byte[1024]);
                    var receivedResult = await socket.ReceiveAsync(buffer, CancellationToken.None);
                    if (receivedResult.MessageType == WebSocketMessageType.Close)
                    {
                        // The client asked to close; acknowledge it.
                        await socket.CloseAsync(WebSocketCloseStatus.Empty, string.Empty, CancellationToken.None);
                        _sockets.Remove(socket);
                        break;
                    }

                    if (socket.State == WebSocketState.Open)
                    {
                        // Decode only the bytes actually received.
                        string recvMsg = Encoding.UTF8.GetString(buffer.Array, 0, receivedResult.Count);
                        Logger.Info("Received message: " + recvMsg);

                        // Remember the socket before asking iFlytek for audio, so the
                        // synthesis callback knows where to send the result.
                        currentSocket = socket;
                        _xunFei.SendData(recvMsg);

                        // Echo the text back: only the received bytes, not the whole 1024-byte buffer.
                        var recvBytes = Encoding.UTF8.GetBytes(recvMsg);
                        var sendBuffer = new ArraySegment<byte>(recvBytes);
                        await socket.SendAsync(sendBuffer, WebSocketMessageType.Text, true, CancellationToken.None);
                    }
                }
            }
        }
    
    }

    The front-end script:

    var webSocket;
    var player = document.getElementById("player");

    function sendSocketMsg() {
        var msg = $("#msg").val();
        webSocket.send(msg);
        showMsg("Sent: " + msg, "blue");
    }

    openSocket();

    function openSocket() {
        closeSocket(); // drop any previous connection first
        // Use wss:// when the page itself is served over https.
        var scheme = location.protocol === "https:" ? "wss://" : "ws://";
        webSocket = new WebSocket(scheme + location.host + "/api/msg");
        webSocket.onopen = function () {
            showMsg("Connection opened");
        };
        webSocket.onerror = function () {
            showMsg("An error occurred");
        };
        webSocket.onmessage = function (event) {
            showMsg("Received: " + event.data, "yellow");
            // Messages starting with "voice:" carry the URL of the saved audio file.
            if (event.data.indexOf("voice:") === 0) {
                player.src = event.data.substring("voice:".length);
                player.play();
            }
        };
        webSocket.onclose = function () {
            showMsg("Connection closed");
        };
    }

    function closeSocket() {
        if (webSocket != null) {
            webSocket.close();
        }
    }

    // Append a line to the #show log; `type` is used as the CSS class.
    function showMsg(msg, type) {
        if (type == null) type = "gray";
        // .text() escapes the message so it cannot inject markup.
        $("<span>").addClass(type).text(msg).appendTo("#show");
        $("#show").append("<br>");
    }
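
    The server sends two kinds of text frames over the same socket: an echo of the submitted text, and a "voice:"-prefixed path once the audio file has been saved. A small helper that tells them apart might look like the sketch below. The name parseServerMessage and the shape of its return value are hypothetical and only illustrate the idea.

```javascript
// Sketch only: classify a text frame received from the server.
// "voice:<path>" carries the URL of the saved audio; anything else is an echo.
const VOICE_PREFIX = "voice:";

function parseServerMessage(data) {
  // Require the prefix at the very start, so echoed user text that merely
  // contains "voice:" somewhere is not mistaken for an audio URL.
  if (typeof data === "string" && data.startsWith(VOICE_PREFIX)) {
    return { kind: "voice", src: data.slice(VOICE_PREFIX.length) };
  }
  return { kind: "echo", text: String(data) };
}
```

    Inside onmessage, the result could then drive the player: if kind is "voice", assign src to player.src and call player.play(); otherwise just log the text.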

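    That gives us a prototype of the product. Next come things like the length of the text, how the audio player looks, switching voices, and so on. A feature that takes one sentence to describe hides a great many details.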
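
    On the point about text length, one simple approach is to split long input into pieces and synthesize them one at a time. The sketch below prefers to break after sentence-ending punctuation and only cuts mid-sentence when a single sentence is too long. The function name splitText and the default of 500 characters are placeholders of mine; the real limit should be taken from the iFlytek documentation.

```javascript
// Sketch only: split text into pieces of at most maxLen characters
// (UTF-16 code units), preferring to break after sentence-ending punctuation.
function splitText(text, maxLen = 500) {
  // Each match is a run of text ending in punctuation, or the unpunctuated tail.
  const sentences = text.match(/[^。！？!?；;\n]*[。！？!?；;\n]+|[^。！？!?；;\n]+$/g) || [];
  const parts = [];
  let current = "";
  for (const sentence of sentences) {
    // Start a new piece if adding this sentence would exceed the limit.
    if (current && current.length + sentence.length > maxLen) {
      parts.push(current);
      current = "";
    }
    current += sentence;
    // A single sentence longer than maxLen is cut at the limit.
    while (current.length > maxLen) {
      parts.push(current.slice(0, maxLen));
      current = current.slice(maxLen);
    }
  }
  if (current) parts.push(current);
  return parts;
}
```

    Each piece could then be sent over the socket in turn, with the front end queueing the returned audio files for playback in order.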

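     Console version source: https://download.csdn.net/download/stoneniqiu/12347028 

     Web version source: https://download.csdn.net/download/stoneniqiu/12347167 

    If you don't have CSDN points, you can follow my subscription account (订阅号) and reply "语音合成" (speech synthesis).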

  • Original article: https://www.cnblogs.com/stoneniqiu/p/12743974.html