2017-12-01 17:00:49 zhenghongzhi6 Views: 7789

This article was first published on the “洪流学堂” WeChat official account.
洪流学堂: stay a few steps ahead.

Source code

https://github.com/zhenghongzhi/WitBaiduAip

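Feature overview

1 Speech recognition

  1. Record audio from the microphone
  2. Convert AudioClip audio data to the PCM16 format used by Baidu speech recognition
  3. A wrapper around the Baidu speech recognition RESTful API, plus a test scene

2 Speech synthesis

  1. A wrapper around the Baidu speech synthesis RESTful API, plus a test scene
  2. Convert MP3 data to an AudioClip at runtime for playback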

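Why not use Baidu's C# SDK
Baidu's C# SDK relies on features that Unity does not support, so it does not work when imported directly into Unity.
The SDK also only wraps the RESTful API and adds no functionality beyond it,
so writing the wrapper yourself is simpler.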

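Update notes

Updated 2018-08-22

Following changes to the Baidu API, speech synthesis performance was optimized: audio is now used in its native format and the third-party plugin was removed.

Updated 2018-03-28

Added platform checks for better Android and iOS support.

Updated 2018-01-11

Added speech synthesis to the project.

Updated 2018-01-02

By popular request, the project was refactored and the source was published on GitHub:
https://github.com/zhenghongzhi/WitBaiduAip

Updated 2017-12-23

Tutorial first published.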



2018-03-19 22:08:27 fengmao31 Views: 176
  using (Stream stream = response.GetResponseStream())
        {
            buffer2 = new byte[stream.Length];
            stream.Read(buffer2, 0, buffer2.Length);

        }


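stream.Length fails here: an HTTP response stream is not seekable, so reading its Length throws NotSupportedException.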



Solutions

1. Use Unity's built-in WWW class

2. Read the stream in chunks until it ends (from https://bbs.csdn.net/topics/360163784)

// Requires: using System.IO; using System.Net;
byte[] result;
byte[] buffer = new byte[4096];

WebRequest wr = WebRequest.Create(someUrl);

using (WebResponse response = wr.GetResponse())
using (Stream responseStream = response.GetResponseStream())
using (MemoryStream memoryStream = new MemoryStream())
{
    // Read until Read returns 0 (end of stream); never rely on stream.Length.
    int count;
    while ((count = responseStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        memoryStream.Write(buffer, 0, count);
    }

    result = memoryStream.ToArray();
}
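
The chunked read above can also be wrapped in a small reusable helper. The following is only a sketch: `StreamUtil` and `ReadAllBytes` are hypothetical names that do not appear in the original post, and it assumes a runtime where `Stream.CopyTo` is available (.NET Framework 4.0 or later). It never touches `Length` or `Position`, which is why it also suits non-seekable streams.

```csharp
using System.IO;

// Hypothetical helper, not part of the original post.
public static class StreamUtil
{
    // Reads a stream to its end without using Length or Position,
    // so it also works for non-seekable streams such as HTTP response streams.
    public static byte[] ReadAllBytes(Stream source)
    {
        using (var memory = new MemoryStream())
        {
            // CopyTo reads in chunks internally until Read returns 0.
            source.CopyTo(memory);
            return memory.ToArray();
        }
    }
}
```

With such a helper, the body of the innermost block could shrink to a single call such as `result = StreamUtil.ReadAllBytes(responseStream);`.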

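2017-12-16 23:27:31 YongshuangZhao Views: 2602

Implementing speech recognition in Unity (note: this works only on Windows 10 and requires the Windows 10 Cortana speech feature to be enabled)

The code is below; for details see the Unity documentation: https://docs.unity3d.com/ScriptReference/Windows.Speech.KeywordRecognizer.html

(The UnityEngine.Windows.Speech API covers keyword recognition, grammar recognition, and dictation recognition.)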

using UnityEngine;
using UnityEngine.Windows.Speech;

public class AddSpeechTestNoneless : MonoBehaviour {
	// Keywords to listen for: confirm, start, back, pause
	public string[] keyWords = new string[]{"确认","开始","返回","暂停"};
	public ConfidenceLevel confidenLevel = ConfidenceLevel.Medium;
	PhraseRecognizer recognizer;

	void Start () {
		recognizer = new KeywordRecognizer (keyWords, confidenLevel);
		recognizer.OnPhraseRecognized += Display;  // register the event handler
		recognizer.Start ();
	}

	// Logs the recognized keyword.
	public void Display(PhraseRecognizedEventArgs args){
		Debug.Log (args.text);
	}

	// Release the recognizer when the component is destroyed.
	void OnDestroy () {
		if (recognizer != null) {
			recognizer.Dispose ();
		}
	}
}

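2018-07-26 01:43:08 luoyikun Views: 1295

Reposted from 洪流学堂
Speech to text
1. Start recording from the microphone

// Default device, no looping, 30-second buffer, 16000 Hz sample rate
_clipRecord = Microphone.Start(null, false, 30, 16000);

2. Convert Unity AudioClip data to 16-bit PCM data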

        // Requires: using System; using UnityEngine;

        /// <summary>
        /// Converts Unity AudioClip data to 16-bit PCM data.
        /// </summary>
        /// <param name="clip">The clip to convert.</param>
        /// <returns>The samples as 16-bit PCM bytes, in the platform's byte order.</returns>
        public static byte[] ConvertAudioClipToPCM16(AudioClip clip)
        {
            var samples = new float[clip.samples * clip.channels];
            clip.GetData(samples, 0);
            var samples_int16 = new short[samples.Length];

            for (var index = 0; index < samples.Length; index++)
            {
                // Clamp to [-1, 1] so the cast to short cannot overflow.
                var f = Mathf.Clamp(samples[index], -1f, 1f);
                samples_int16[index] = (short) (f * short.MaxValue);
            }

            var byteArray = new byte[samples_int16.Length * 2];
            Buffer.BlockCopy(samples_int16, 0, byteArray, 0, byteArray.Length);

            return byteArray;
        }
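
To make the conversion step easier to check outside Unity, here is a minimal sketch of the same float-to-PCM16 mapping in plain C#. `Pcm16` and `FromFloats` are hypothetical names, not part of the original project. Unlike `Buffer.BlockCopy`, this version writes each sample explicitly in little-endian order, on the assumption that the receiving service expects that layout, which is the usual one for raw PCM.

```csharp
using System;

// Hypothetical helper, not part of the original project.
public static class Pcm16
{
    // Converts float samples in [-1, 1] to 16-bit PCM bytes, low byte first.
    public static byte[] FromFloats(float[] samples)
    {
        var bytes = new byte[samples.Length * 2];
        for (var i = 0; i < samples.Length; i++)
        {
            // Clamp first so out-of-range input cannot overflow the short.
            var clamped = Math.Max(-1f, Math.Min(1f, samples[i]));
            var value = (short)(clamped * short.MaxValue);
            bytes[2 * i] = (byte)(value & 0xFF);            // low byte
            bytes[2 * i + 1] = (byte)((value >> 8) & 0xFF); // high byte
        }
        return bytes;
    }
}
```

In a Unity script, the float array would come from `clip.GetData(samples, 0)` exactly as in the method above.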

3. Upload the byte stream to the Baidu speech URI to get the transcribed text

 public IEnumerator Recognize(byte[] data, Action<AsrResponse> callback)
        {
            yield return PreAction ();

            if (tokenFetchStatus == Base.TokenFetchStatus.Failed) {
                Debug.LogError("Failed to fetch token. Please check your APIKey and SecretKey");
                yield break;
            }

            var uri = string.Format("{0}?lan=zh&cuid={1}&token={2}", UrlAsr, SystemInfo.deviceUniqueIdentifier, Token);

            var form = new WWWForm();
            form.AddBinaryData("audio", data);
            var www = UnityWebRequest.Post(uri, form);
            www.SetRequestHeader("Content-Type", "audio/pcm;rate=16000");
            yield return www.SendWebRequest();

            if (string.IsNullOrEmpty(www.error))
            {
                Debug.Log(www.downloadHandler.text);
                callback(JsonUtility.FromJson<AsrResponse>(www.downloadHandler.text));
            }
            else
                Debug.LogError(www.error);
        }

Text to speech
1. Upload the text to the Baidu speech URI to get a byte stream

 public IEnumerator Synthesis(string text, Action<TtsResponse> callback, int speed = 5, int pit = 5, int vol = 5,
            Pronouncer per = Pronouncer.Female)
        {
            yield return PreAction();

            if (tokenFetchStatus == Base.TokenFetchStatus.Failed)
            {
                Debug.LogError("Failed to fetch token. Please check your APIKey and SecretKey");
                callback(new TtsResponse()
                {
                    err_no = -1,
                    err_msg = "Failed to fetch token. Please check your APIKey and SecretKey"
                });
                yield break;
            }

            var param = new Dictionary<string, string>();
            param.Add("tex", text);
            param.Add("tok", Token);
            param.Add("cuid", SystemInfo.deviceUniqueIdentifier);
            param.Add("ctp", "1");
            param.Add("lan", "zh");
            param.Add("spd", Mathf.Clamp(speed, 0, 9).ToString());
            param.Add("pit", Mathf.Clamp(pit, 0, 9).ToString());
            param.Add("vol", Mathf.Clamp(vol, 0, 15).ToString());
            param.Add("per", ((int) per).ToString());

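            string url = UrlTts;
            int i = 0;
            foreach (var p in param)
            {
                url += i != 0 ? "&" : "?";
                // Escape values so spaces, '&', and non-ASCII text survive in the query string.
                url += p.Key + "=" + UnityWebRequest.EscapeURL(p.Value);
                i++;
            }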

#if UNITY_STANDALONE || UNITY_EDITOR || UNITY_UWP
            var www = UnityWebRequest.Get(url);
#else
            var www = UnityWebRequestMultimedia.GetAudioClip(url, AudioType.MPEG);
#endif
            Debug.Log(www.url);
            yield return www.SendWebRequest();


            if (string.IsNullOrEmpty(www.error))
            {
                var type = www.GetResponseHeader("Content-Type");
                Debug.Log("response type: " + type);

                if (type == "audio/mp3")
                {
#if UNITY_STANDALONE || UNITY_EDITOR || UNITY_UWP
                    var clip = GetAudioClipFromMP3ByteArray(www.downloadHandler.data);
                    var response = new TtsResponse {clip = clip};
#else
                    var response = new TtsResponse {clip = DownloadHandlerAudioClip.GetContent(www) };
#endif
                    callback(response);
                }
                else
                {
                    Debug.LogError(www.downloadHandler.text);
                    callback(JsonUtility.FromJson<TtsResponse>(www.downloadHandler.text));
                }
            }
            else
                Debug.LogError(www.error);
        }
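
When a query string is assembled by hand as in the method above, values that contain spaces, `&`, or non-ASCII characters have to be percent-encoded, or the server may see a different set of parameters. The following is a minimal sketch in plain C# using `Uri.EscapeDataString`. `QueryString` and `Build` are hypothetical names, and the exact encoding the Baidu endpoint expects is an assumption that should be checked against its documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of the original project.
public static class QueryString
{
    // Appends percent-encoded key=value pairs to baseUrl,
    // using '?' before the first pair and '&' before the rest.
    public static string Build(string baseUrl, IEnumerable<KeyValuePair<string, string>> parameters)
    {
        var sb = new StringBuilder(baseUrl);
        var separator = '?';
        foreach (var p in parameters)
        {
            sb.Append(separator)
              .Append(Uri.EscapeDataString(p.Key))
              .Append('=')
              .Append(Uri.EscapeDataString(p.Value));
            separator = '&';
        }
        return sb.ToString();
    }
}
```

A `Dictionary<string, string>` such as `param` can be passed directly, since it enumerates as key-value pairs.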

2. Convert the byte stream to an AudioClip for playback

private AudioClip GetAudioClipFromMP3ByteArray(byte[] mp3Data)
        {
            var mp3MemoryStream = new MemoryStream(mp3Data);
            MP3Sharp.MP3Stream mp3Stream = new MP3Sharp.MP3Stream(mp3MemoryStream);

            //Get the converted stream data
            MemoryStream convertedAudioStream = new MemoryStream();
            byte[] buffer = new byte[2048];
            int bytesReturned = -1;
            int totalBytesReturned = 0;

            while (bytesReturned != 0)
            {
                bytesReturned = mp3Stream.Read(buffer, 0, buffer.Length);
                convertedAudioStream.Write(buffer, 0, bytesReturned);
                totalBytesReturned += bytesReturned;
            }

            Debug.Log("MP3 file has " + mp3Stream.ChannelCount + " channels with a frequency of " +
                      mp3Stream.Frequency);

            byte[] convertedAudioData = convertedAudioStream.ToArray();

            //Workaround for an MP3Sharp bug: mono audio is decoded with extra right-channel data, so skip those samples
            byte[] data = new byte[convertedAudioData.Length / 2];
            for (int i = 0; i < data.Length; i += 2)
            {
                data[i] = convertedAudioData[2 * i];
                data[i + 1] = convertedAudioData[2 * i + 1];
            }

            Wav wav = new Wav(data, mp3Stream.ChannelCount, mp3Stream.Frequency);

            AudioClip audioClip = AudioClip.Create("testSound", wav.SampleCount, 1, wav.Frequency, false);
            audioClip.SetData(wav.LeftChannel, 0);

            return audioClip;
        }
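
The channel-skipping loop above keeps two bytes out of every four. As an illustration of that idea, here is a minimal sketch in plain C# that extracts the left channel from interleaved 16-bit stereo PCM and converts it to floats. `PcmChannels` and `LeftChannelFromStereo16` are hypothetical names, and the sketch assumes little-endian samples; it is not the `Wav` class used by the original code.

```csharp
// Hypothetical helper, not part of the original project.
public static class PcmChannels
{
    // Takes interleaved 16-bit little-endian stereo PCM (L, R, L, R, ...)
    // and returns the left channel as floats in [-1, 1).
    public static float[] LeftChannelFromStereo16(byte[] interleaved)
    {
        var frames = interleaved.Length / 4; // 2 channels * 2 bytes per sample
        var left = new float[frames];
        for (var i = 0; i < frames; i++)
        {
            // Combine low and high byte of the left sample; the right sample is skipped.
            var sample = (short)(interleaved[4 * i] | (interleaved[4 * i + 1] << 8));
            left[i] = sample / 32768f;
        }
        return left;
    }
}
```

A float array produced this way has the shape that `AudioClip.SetData` accepts for a one-channel clip.
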
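2018-03-23 14:16:08 fcc7927752836 Views: 228

A demo on the Unity3D platform that combines image recognition and speech synthesis

Its main function is to capture frames from the phone camera, extract the characters in the image with image recognition, synthesize speech from those characters, and finally play the result.

- Development environment:

The Baidu Android API project can be opened directly in Android Studio, or you can import the precompiled DLL directly into Unity3D. Password: sdgc

The Unity3D project is still being tidied up, mainly because it was written too casually; I will share the complete project once it is ready.


I'm wondering whether I should record a video instead; writing really isn't my thing~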
