  • OpenEars: includes offline speech processing and more

    http://www.politepix.com/openears/

    Welcome to OpenEars: free speech recognition and speech synthesis for the iPhone


    If you aren't quite ready to read the documentation, visit the quickstart tool so you can get started with OpenEars in just a few minutes! You can come back and read the docs or the FAQ once you have specific questions.

    Introduction

    OpenEars is a shared-source iOS framework for iPhone voice recognition and speech synthesis (TTS). It lets you easily implement round-trip English language speech recognition and text-to-speech on the iPhone and iPad and uses the open source CMU Pocketsphinx, CMU Flite, and CMUCLMTK libraries, and it is free to use in an iPhone or iPad app. It is the most popular offline framework for speech recognition and speech synthesis on iOS and has been featured in development books such as O'Reilly's Basic Sensors in iOS by Alasdair Allan and Cocos2d for iPhone 1 Game Development Cookbook by Nathan Burba.


    Highly-accurate large-vocabulary recognition (that is, trying to recognize any word the user speaks out of many thousands of known words) is not yet a reality for local in-app processing on the iPhone given the hardware limitations of the platform; even Siri does its large-vocabulary recognition on the server side. However, Pocketsphinx (the open source voice recognition engine that OpenEars uses) is capable of local recognition on the iPhone of vocabularies with hundreds of words depending on the environment and other factors, and performs very well with command-and-control language models. The best part is that it uses no network connectivity because all processing occurs locally on the device.

    The current version of OpenEars is 1.2.4. Download OpenEars 1.2.4 or read its changelog.

    Features of OpenEars

    OpenEars can:

    • Listen continuously for speech on a background thread, while suspending or resuming speech processing on demand, all while using less than 4% CPU on average on an iPhone 4 (decoding speech, text-to-speech, updating the UI and other intermittent functions use more CPU),
    • Use any of 9 voices for speech, including male and female voices with a range of speed/quality level, and switch between them on the fly,
    • Change the pitch, speed and variance of any text-to-speech voice,
    • Know whether headphones are plugged in and continue voice recognition during text-to-speech only when they are plugged in,
    • Support bluetooth audio devices (experimental),
    • Dispatch information to any part of your app about the results of speech recognition and speech, or changes in the state of the audio session (such as an incoming phone call or headphones being plugged in),
    • Deliver level metering for both speech input and speech output so you can design visual feedback for both states.
    • Support JSGF grammars,
    • Dynamically generate new ARPA language models in-app based on input from an NSArray of NSStrings,
    • Switch between ARPA language models or JSGF grammars on the fly,
    • Get n-best lists with scoring,
    • Test existing recordings,
    • Be easily interacted with via standard and simple Objective-C methods,
    • Control all audio functions with text-to-speech and speech recognition in memory instead of writing audio files to disk and then reading them,
    • Drive speech recognition with a low-latency Audio Unit driver for highest responsiveness,
    • Be installed in a Cocoa-standard fashion using an easy-peasy already-compiled framework.
    In addition to its various new features and faster recognition/text-to-speech responsiveness, OpenEars now has improved recognition accuracy. OpenEars is free to use in an iPhone or iPad app.

    Warning
    Before using OpenEars, please note it has to use a different audio driver on the Simulator that is less accurate, so it is always necessary to evaluate accuracy on a real device. Please don't submit support requests for accuracy issues with the Simulator.


    Warning
    Because Apple has removed armv6 architecture compiling in Xcode 4.5, and it is only possible to support upcoming devices using the armv7s architecture available in Xcode 4.5, there was no other option than to end support for armv6 devices after OpenEars 1.2. That means that the current version of OpenEars only supports armv7 and armv7s devices (iPhone 3GS and later). If your app supports older devices like the first generation iPhone or the iPhone 3G, you can continue to download the legacy edition of OpenEars 1.2 here, but that edition will not update further – all updated versions of OpenEars starting with 1.2.1 will not support armv6 devices, just armv7 and armv7s. If you have previously been supporting older devices and you want to submit an app update removing that support, you must set your minimum deployment target to iOS 4.3 or later, or your app will be rejected by Apple. The framework is 100% compatible with LLVM-using versions of Xcode which precede version 4.5, but your app must be set to not compile the armv6 architecture in order to use it.

    Installation

    To use OpenEars:

    • Create your own app, and add the iOS frameworks AudioToolbox and AVFoundation to it.
    • Inside your downloaded distribution there is a folder called "Frameworks". Drag the "Frameworks" folder into your app project in Xcode.

    OK, now that you've finished laying the groundwork, you have to...wait, that's everything. You're ready to start using OpenEars. Give the sample app a spin to try out the features (the sample app uses ARC so you'll need a recent Xcode version) and then visit the Politepix interactive tutorial generator for a customized tutorial showing you exactly what code to add to your app for all of the different functionality of OpenEars.

    If the steps on this page didn't work for you, you can get free support at the forums, read the FAQ, brush up on the documentation, or open a private email support incident at the Politepix shop. If you'd like to read the documentation, simply read onward.

    Basic concepts

    There are a few basic concepts to understand about voice recognition and OpenEars that will make it easier to create an app.

    • Local or offline speech recognition versus server-based or online speech recognition: most speech recognition on the iPhone is done by streaming the speech audio to servers. OpenEars works by doing the recognition inside the iPhone without using the network. This saves bandwidth and results in faster response, but since a server is much more powerful than a phone it means that we have to work with much smaller vocabularies to get accurate recognition.
    • Language Models. The language model is the vocabulary that you want OpenEars to understand, in a format that its speech recognition engine can understand. The smaller and better-adapted to your users' real usage cases the language model is, the better the accuracy. An ideal language model for PocketsphinxController has fewer than 200 words.
    • The parts of OpenEars. OpenEars has a simple, flexible and very powerful architecture. PocketsphinxController recognizes speech using a language model that was dynamically created by LanguageModelGenerator. FliteController creates synthesized speech (TTS). And OpenEarsEventsObserver dispatches messages about every feature of OpenEars (what speech was understood by the engine, whether synthesized speech is in progress, if there was an audio interruption) to any part of your app. A minimal sketch of how these parts fit together follows this list.
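
    The following is a minimal sketch of how these parts connect: generate a model, start listening, and speak back each hypothesis. It assumes the fliteController, slt, pocketsphinxController and openEarsEventsObserver properties and lazy accessors that are set up in the class references later in this document, so it is an illustration rather than a complete view controller.

    // Minimal end-to-end sketch. Assumes the properties and lazy accessors
    // shown in the class references below; the word list and file name are
    // arbitrary examples.
    - (void)viewDidLoad {
    	[super viewDidLoad];
    	[self.openEarsEventsObserver setDelegate:self];

    	LanguageModelGenerator *lmGenerator = [[LanguageModelGenerator alloc] init];
    	NSArray *words = [NSArray arrayWithObjects:@"HELLO", @"GOODBYE", nil];
    	NSError *err = [lmGenerator generateLanguageModelFromArray:words withFilesNamed:@"MyFirstModel"];

    	if([err code] == noErr) {
    		NSDictionary *results = [err userInfo];
    		[self.pocketsphinxController startListeningWithLanguageModelAtPath:[results objectForKey:@"LMPath"]
    		                                                  dictionaryAtPath:[results objectForKey:@"DictionaryPath"]
    		                                               languageModelIsJSGF:NO];
    	}
    }

    // OpenEarsEventsObserverDelegate callback: speak back whatever was heard.
    - (void) pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID {
    	[self.fliteController say:[NSString stringWithFormat:@"You said %@", hypothesis] withVoice:self.slt];
    }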

    FliteController Class Reference

    Detailed Description

    The class that controls speech synthesis (TTS) in OpenEars.

    Usage examples

    Preparing to use the class:

    To use FliteController, you need to have at least one Flite voice added to your project. When you added the "Frameworks" folder of OpenEars to your app, you already imported a voice called Slt, so these instructions will use the Slt voice. You can get eight more free voices in OpenEarsExtras, available at https://bitbucket.org/Politepix/openearsextras

    What to add to your header:

    Add the following lines to your header (the .h file). Under the imports at the very top:
    #import <Slt/Slt.h>
    #import <OpenEars/FliteController.h>
    
    In the middle part where instance variables go:
    FliteController *fliteController;
    Slt *slt;
    
    In the bottom part where class properties go:
    @property (strong, nonatomic) FliteController *fliteController;
    @property (strong, nonatomic) Slt *slt;
    

    What to add to your implementation:

    Add the following to your implementation (the .m file). Under the @implementation keyword at the top:
    @synthesize fliteController;
    @synthesize slt;
    
    Among the other methods of the class, add these lazy accessor methods for confident memory management of the object:
    - (FliteController *)fliteController {
    	if (fliteController == nil) {
    		fliteController = [[FliteController alloc] init];
    	}
    	return fliteController;
    }
    
    - (Slt *)slt {
    	if (slt == nil) {
    		slt = [[Slt alloc] init];
    	}
    	return slt;
    }
    

    How to use the class methods:

    In the method where you want to call speech (to test this out, add it to your viewDidLoad method), add the following method call:
    [self.fliteController say:@"A short statement" withVoice:self.slt];
    
    Warning
    There can only be one FliteController instance in your app at any given moment.

    Method Documentation

    - (void) say:   (NSString *)  statement
    withVoice:   (FliteVoice *)  voiceToUse 
           

    This takes an NSString which is the word or phrase you want to say, and the FliteVoice to use to say the phrase. Usage Example:

    [self.fliteController say:@"Say it, don't spray it." withVoice:self.slt];

    There are a total of nine FliteVoices available for use with OpenEars. The Slt voice is the most popular one and it ships with OpenEars. The other eight voices can be downloaded as part of the OpenEarsExtras package available at the URL http://bitbucket.org/Politepix/openearsextras. To use them, just drag the desired downloaded voice's framework into your app, import its header at the top of your calling class (e.g. #import <Slt/Slt.h> or #import <Rms/Rms.h>) and instantiate it as you would any other object, then pass the instantiated voice to this method.

    - (Float32) fliteOutputLevel      

    A read-only attribute that tells you the volume level of synthesized speech in progress. This is a UI hook. You can't read it on the main thread; poll it from a background thread instead, as in the sketch below.
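
    One possible way to poll this level off the main thread is sketched here. The flag speechMeterShouldRun and the method updateSpeechMeterTo: are hypothetical members of your own class (for example, toggled in the fliteDidStartSpeaking/fliteDidFinishSpeaking delegate methods), not part of OpenEars.

    // Sketch only: read fliteOutputLevel on a background queue and push each
    // reading back to the main thread for UI updates.
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    	while(self.speechMeterShouldRun) { // hypothetical flag of your own
    		Float32 level = self.fliteController.fliteOutputLevel;
    		dispatch_async(dispatch_get_main_queue(), ^{
    			[self updateSpeechMeterTo:level]; // hypothetical UI method of your own
    		});
    		[NSThread sleepForTimeInterval:0.1]; // roughly ten readings per second
    	}
    });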

    Property Documentation

    - (float) duration_stretch

    duration_stretch changes the speed of the voice. It is on a scale of 0.0-2.0 where 1.0 is the default.

    - (float) target_mean

    target_mean changes the pitch of the voice. It is on a scale of 0.0-2.0 where 1.0 is the default.

    - (float) target_stddev

    target_stddev changes the variance of the voice. It is on a scale of 0.0-2.0 where 1.0 is the default.

    - (BOOL) userCanInterruptSpeech

    Set userCanInterruptSpeech to TRUE in order to let new incoming human speech cut off synthesized speech in progress.
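
    Taken together, the properties above can be set on the lazily-instantiated controller before speaking. This is just a sketch; the values are arbitrary examples, not recommended defaults.

    // Sketch: configure the voice, then speak.
    self.fliteController.duration_stretch = 1.2f;   // slightly slower speech
    self.fliteController.target_mean = 1.1f;        // slightly higher pitch
    self.fliteController.target_stddev = 1.0f;      // default variance
    self.fliteController.userCanInterruptSpeech = TRUE;
    [self.fliteController say:@"These settings are now in effect." withVoice:self.slt];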


    LanguageModelGenerator Class Reference

    Detailed Description

    The class that generates the vocabulary the PocketsphinxController is able to understand.

    Usage examples

    What to add to your implementation:

    Add the following to your implementation (the .m file). Under the imports at the very top:
    #import <OpenEars/LanguageModelGenerator.h>
    
    Wherever you need to instantiate the language model generator, do it as follows:
    LanguageModelGenerator *lmGenerator = [[LanguageModelGenerator alloc] init];
    

    How to use the class methods:

    In the method where you want to create your language model (for instance your viewDidLoad method), add the following method call (replacing the placeholders like "WORD" and "A PHRASE" with actual words and phrases you want to be able to recognize):
    NSArray *words = [NSArray arrayWithObjects:@"WORD", @"STATEMENT", @"OTHER WORD", @"A PHRASE", nil];
    NSString *name = @"NameIWantForMyLanguageModelFiles";
    NSError *err = [lmGenerator generateLanguageModelFromArray:words withFilesNamed:name];
    
    
    NSDictionary *languageGeneratorResults = nil;
    
    NSString *lmPath = nil;
    NSString *dicPath = nil;
    	
    if([err code] == noErr) {
    	
    	languageGeneratorResults = [err userInfo];
    		
    	lmPath = [languageGeneratorResults objectForKey:@"LMPath"];
    	dicPath = [languageGeneratorResults objectForKey:@"DictionaryPath"];
    		
    } else {
    	NSLog(@"Error: %@",[err localizedDescription]);
    }
    
    If you are using the default English-language model generation, it is a requirement to enter your words and phrases in all capital letters, since the model is generated against a dictionary in which the entries are capitalized (meaning that if the words in the array aren't capitalized, they will not match the dictionary and you will not have the widest variety of pronunciations understood for the word you are using).

    If you need to create a fixed language model ahead of time instead of creating it dynamically in your app, just use this method (or generateLanguageModelFromTextFile:withFilesNamed:) to submit your full language model using the Simulator, then use the Simulator documents folder script to get the language model and dictionary file out of the documents folder and add them to your app bundle, referencing them from there.

    Method Documentation

    - (NSError *) generateLanguageModelFromArray:   (NSArray *)  languageModelArray
    withFilesNamed:   (NSString *)  fileName 
           

    Generate a language model from an array of NSStrings which are the words and phrases you want PocketsphinxController or PocketsphinxController+RapidEars to understand. Putting a phrase in as a string makes it somewhat more probable that the phrase will be recognized as a phrase when spoken. fileName is the way you want the output files to be named, for instance if you enter "MyDynamicLanguageModel" you will receive files output to your Documents directory titled MyDynamicLanguageModel.dic, MyDynamicLanguageModel.arpa, and MyDynamicLanguageModel.DMP. The error that this method returns contains the paths to the files that were created in a successful generation effort in its userInfo when the error code is noErr. The words and phrases in languageModelArray must be written with capital letters exclusively, for instance "word" must appear in the array as "WORD".

    - (NSError *) generateLanguageModelFromTextFile:   (NSString *)  pathToTextFile
    withFilesNamed:   (NSString *)  fileName 
           

    Generate a language model from a text file containing words and phrases you want PocketsphinxController to understand. The file should be formatted with every word or contiguous phrase on its own line with a line break afterwards. Putting a phrase in on its own line makes it somewhat more probable that the phrase will be recognized as a phrase when spoken. Give the correct full path to the text file as a string. fileName is the way you want the output files to be named, for instance if you enter "MyDynamicLanguageModel" you will receive files output to your Documents directory titled MyDynamicLanguageModel.dic, MyDynamicLanguageModel.arpa, and MyDynamicLanguageModel.DMP. The error that this method returns contains the paths to the files that were created in a successful generation effort in its userInfo when the error code is noErr. The words and phrases in the text file must be written with capital letters exclusively, for instance "word" must appear in the file as "WORD".
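
    A sketch of calling this with a corpus file shipped in the app bundle follows. "OpenEarsCorpus.txt" is a hypothetical file of your own (one word or phrase per line, in capital letters), and lmGenerator is the instance created as shown earlier.

    // Sketch: generate a model and dictionary from a bundled text file.
    NSString *corpusPath = [[NSBundle mainBundle] pathForResource:@"OpenEarsCorpus" ofType:@"txt"];
    NSError *err = [lmGenerator generateLanguageModelFromTextFile:corpusPath withFilesNamed:@"MyTextFileModel"];
    if([err code] == noErr) {
    	NSDictionary *results = [err userInfo];
    	NSString *lmPath = [results objectForKey:@"LMPath"];
    	NSString *dicPath = [results objectForKey:@"DictionaryPath"];
    	// Pass lmPath and dicPath to PocketsphinxController as shown below.
    }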

    Property Documentation

    - (BOOL) verboseLanguageModelGenerator

    Set this to TRUE to get verbose output

    - (BOOL) useFallbackMethod

    Advanced: turn this off if the words in your input array or text file aren't in English and you are using a custom dictionary file

    - (NSString *) dictionaryPathAsString

    Advanced: if you have your own pronunciation dictionary you want to use instead of CMU07a.dic you can assign its full path to this property before running the language model generation.
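
    For instance, a non-English setup might look like the following sketch. "MyCustom.dic" and myWordArray are hypothetical stand-ins for your own dictionary file and word list.

    // Sketch: use a custom pronunciation dictionary instead of the default.
    LanguageModelGenerator *lmGenerator = [[LanguageModelGenerator alloc] init];
    lmGenerator.useFallbackMethod = FALSE; // the words are not in English
    lmGenerator.dictionaryPathAsString = [[NSBundle mainBundle] pathForResource:@"MyCustom" ofType:@"dic"];
    lmGenerator.verboseLanguageModelGenerator = TRUE; // log details while testing
    NSError *err = [lmGenerator generateLanguageModelFromArray:myWordArray withFilesNamed:@"MyCustomModel"];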


    OpenEarsEventsObserver Class Reference

    Detailed Description

    OpenEarsEventsObserver provides a large set of delegate methods that allow you to receive information about the events in OpenEars from anywhere in your app. You can create as many OpenEarsEventsObservers as you need and receive information using them simultaneously. All of the documentation for the use of OpenEarsEventsObserver is found in the section OpenEarsEventsObserverDelegate.

    Property Documentation

    - (id< OpenEarsEventsObserverDelegate >) delegate

    To use the OpenEarsEventsObserverDelegate methods, assign this delegate to the class hosting OpenEarsEventsObserver and then use the delegate methods documented under OpenEarsEventsObserverDelegate. There is a complete example of how to do this explained under the OpenEarsEventsObserverDelegate documentation.


    OpenEarsLogging Class Reference

    Detailed Description

    A singleton which turns logging on or off for the entire framework. The type of logging is related to overall framework functionality such as the audio session and timing operations. Please turn OpenEarsLogging on for any issue you encounter. It will probably show the problem, but if not you can show the log on the forum and get help.

    Warning
    The individual classes such as PocketsphinxController and LanguageModelGenerator have their own verbose flags which are separate from OpenEarsLogging.

    Method Documentation

    + (id) startOpenEarsLogging      

    This just turns on logging. If you don't want logging in your session, don't send the startOpenEarsLogging message.

    Example Usage:

    Before implementation:

    #import <OpenEars/OpenEarsLogging.h>

    In implementation:
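    [OpenEarsLogging startOpenEarsLogging];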


    PocketsphinxController Class Reference

    Detailed Description

    The class that controls local speech recognition in OpenEars.

    Usage examples

    Preparing to use the class:

    To use PocketsphinxController, you need a language model and a phonetic dictionary for it. These files define which words PocketsphinxController is capable of recognizing. They are created above by using LanguageModelGenerator.

    What to add to your header:

    Add the following lines to your header (the .h file). Under the imports at the very top:
    #import <OpenEars/PocketsphinxController.h>
    
    In the middle part where instance variables go:
    PocketsphinxController *pocketsphinxController;
    
    In the bottom part where class properties go:
    @property (strong, nonatomic) PocketsphinxController *pocketsphinxController;
    

    What to add to your implementation:

    Add the following to your implementation (the .m file). Under the @implementation keyword at the top:
    @synthesize pocketsphinxController;
    
    Among the other methods of the class, add this lazy accessor method for confident memory management of the object:
    - (PocketsphinxController *)pocketsphinxController {
    	if (pocketsphinxController == nil) {
    		pocketsphinxController = [[PocketsphinxController alloc] init];
    	}
    	return pocketsphinxController;
    }
    

    How to use the class methods:

    In the method where you want to recognize speech (to test this out, add it to your viewDidLoad method), add the following method call:
    [self.pocketsphinxController startListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath languageModelIsJSGF:NO];
    
    Warning
    There can only be one PocketsphinxController instance in your app.

    Method Documentation

    - (void) startListeningWithLanguageModelAtPath:   (NSString *)  languageModelPath
    dictionaryAtPath:   (NSString *)  dictionaryPath
    languageModelIsJSGF:   (BOOL)  languageModelIsJSGF 
           

    Start the speech recognition engine up. You provide the full paths to a language model and a dictionary file which are created using LanguageModelGenerator.

    - (void) stopListening      

    Shut down the engine. You must do this before releasing a parent view controller that contains PocketsphinxController.

    - (void) suspendRecognition      

    Keep the engine going but stop listening to speech until resumeRecognition is called. Takes effect instantly.

    - (void) resumeRecognition      

    Resume listening for speech after suspendRecognition has been called.

    - (void) changeLanguageModelToFile:   (NSString *)  languageModelPathAsString
    withDictionary:   (NSString *)  dictionaryPathAsString 
           

    Change from one language model to another. This lets you change which words you are listening for depending on the context in your app.
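
    A sketch of switching models while the engine keeps running follows. secondLmPath and secondDicPath are hypothetical paths produced earlier by another LanguageModelGenerator run.

    // Sketch: swap in a different vocabulary without stopping the engine.
    [self.pocketsphinxController changeLanguageModelToFile:secondLmPath withDictionary:secondDicPath];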

    - (Float32) pocketsphinxInputLevel      

    Gives the volume of the incoming speech. This is a UI hook. You can't read it on the main thread or it will block.

    - (void) runRecognitionOnWavFileAtPath:   (NSString *)  wavPath
    usingLanguageModelAtPath:   (NSString *)  languageModelPath
    dictionaryAtPath:   (NSString *)  dictionaryPath
    languageModelIsJSGF:   (BOOL)  languageModelIsJSGF 
           

    You can use this to run recognition on an already-recorded WAV file for testing. The WAV file has to be 16-bit and 16000 samples per second.
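
    For instance, to test against a bundled recording you might do the following. "test.wav" is a hypothetical 16-bit, 16000 Hz file of your own, and lmPath/dicPath come from LanguageModelGenerator as shown earlier.

    // Sketch: run recognition over a pre-recorded WAV file instead of live audio.
    NSString *wavPath = [[NSBundle mainBundle] pathForResource:@"test" ofType:@"wav"];
    [self.pocketsphinxController runRecognitionOnWavFileAtPath:wavPath
                                      usingLanguageModelAtPath:lmPath
                                              dictionaryAtPath:dicPath
                                           languageModelIsJSGF:NO];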

    Property Documentation

    - (float) secondsOfSilenceToDetect

    This is how long PocketsphinxController should wait after speech ends before attempting to recognize what was said. This defaults to 0.7 seconds.

    - (BOOL) returnNbest

    Advanced: set this to TRUE to receive n-best results.

    - (int) nBestNumber

    Advanced: the number of n-best results to return. This is a maximum number to return – if there are null hypotheses, fewer than this number will be returned.

    - (int) calibrationTime

    How long to calibrate for. This can only be one of the values '1', '2', or '3'. Defaults to 1.

    - (BOOL) verbosePocketSphinx

    Turn on verbose output. Do this any time you encounter an issue and any time you need to report an issue on the forums.

    - (BOOL) returnNullHypotheses

    By default, PocketsphinxController won't return a hypothesis if for some reason the hypothesis is null (this can happen if the perceived sound was just noise). If you need even empty hypotheses to be returned, you can set this to TRUE before starting PocketsphinxController.
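
    Taken together, a sketch of configuring these properties on the lazily-instantiated controller before starting to listen might look like this. The values are illustrative only.

    // Sketch: tune the controller, then start listening.
    self.pocketsphinxController.secondsOfSilenceToDetect = 0.5f; // react a bit faster than the 0.7 default
    self.pocketsphinxController.calibrationTime = 2;             // may be 1, 2 or 3
    self.pocketsphinxController.returnNbest = TRUE;
    self.pocketsphinxController.nBestNumber = 4;
    self.pocketsphinxController.verbosePocketSphinx = TRUE;      // log details while debugging
    [self.pocketsphinxController startListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath languageModelIsJSGF:NO];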


    <OpenEarsEventsObserverDelegate> Protocol Reference

    Detailed Description

    OpenEarsEventsObserver provides a large set of delegate methods that allow you to receive information about the events in OpenEars from anywhere in your app. You can create as many OpenEarsEventsObservers as you need and receive information using them simultaneously.

    Usage examples

    What to add to your header:

    Add the following lines to your header (the .h file). Under the imports at the very top:
    #import <OpenEars/OpenEarsEventsObserver.h>
    
    At the @interface declaration, add the OpenEarsEventsObserverDelegate inheritance. An example of this for a view controller called ViewController would look like this:
    @interface ViewController : UIViewController <OpenEarsEventsObserverDelegate> {
    
    In the middle part where instance variables go:
    OpenEarsEventsObserver *openEarsEventsObserver;
    
    In the bottom part where class properties go:
    @property (strong, nonatomic) OpenEarsEventsObserver *openEarsEventsObserver;
    

    What to add to your implementation:

    Add the following to your implementation (the .m file). Under the @implementation keyword at the top:
    @synthesize openEarsEventsObserver;
    
    Among the other methods of the class, add this lazy accessor method for confident memory management of the object:
    - (OpenEarsEventsObserver *)openEarsEventsObserver {
    	if (openEarsEventsObserver == nil) {
    		openEarsEventsObserver = [[OpenEarsEventsObserver alloc] init];
    	}
    	return openEarsEventsObserver;
    }
    
    and then right before you start your first OpenEars functionality (for instance, right before your first self.fliteController say:withVoice: message or right before your first self.pocketsphinxController startListeningWithLanguageModelAtPath:dictionaryAtPath:languageModelIsJSGF: message) send this message:
    [self.openEarsEventsObserver setDelegate:self];
    

    How to use the class methods:

    Add these delegate methods of OpenEarsEventsObserver to your class:
    - (void) pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID {
    	NSLog(@"The received hypothesis is %@ with a score of %@ and an ID of %@", hypothesis, recognitionScore, utteranceID);
    }
    
    - (void) pocketsphinxDidStartCalibration {
    	NSLog(@"Pocketsphinx calibration has started.");
    }
    
    - (void) pocketsphinxDidCompleteCalibration {
    	NSLog(@"Pocketsphinx calibration is complete.");
    }
    
    - (void) pocketsphinxDidStartListening {
    	NSLog(@"Pocketsphinx is now listening.");
    }
    
    - (void) pocketsphinxDidDetectSpeech {
    	NSLog(@"Pocketsphinx has detected speech.");
    }
    
    - (void) pocketsphinxDidDetectFinishedSpeech {
    	NSLog(@"Pocketsphinx has detected a period of silence, concluding an utterance.");
    }
    
    - (void) pocketsphinxDidStopListening {
    	NSLog(@"Pocketsphinx has stopped listening.");
    }
    
    - (void) pocketsphinxDidSuspendRecognition {
    	NSLog(@"Pocketsphinx has suspended recognition.");
    }
    
    - (void) pocketsphinxDidResumeRecognition {
    	NSLog(@"Pocketsphinx has resumed recognition."); 
    }
    
    - (void) pocketsphinxDidChangeLanguageModelToFile:(NSString *)newLanguageModelPathAsString andDictionary:(NSString *)newDictionaryPathAsString {
    	NSLog(@"Pocketsphinx is now using the following language model: \n%@ and the following dictionary: %@",newLanguageModelPathAsString,newDictionaryPathAsString);
    }
    
    - (void) pocketSphinxContinuousSetupDidFail { // This can let you know that something went wrong with the recognition loop startup. Turn on OPENEARSLOGGING to learn why.
    	NSLog(@"Setting up the continuous recognition loop has failed for some reason, please turn on OpenEarsLogging to learn more.");
    }
    

    Method Documentation

    - (void) audioSessionInterruptionDidBegin      

    There was an interruption.

    - (void) audioSessionInterruptionDidEnd      

    The interruption ended.

    - (void) audioInputDidBecomeUnavailable      

    The input became unavailable.

    - (void) audioInputDidBecomeAvailable      

    The input became available again.

    - (void) audioRouteDidChangeToRoute:   (NSString *)  newRoute  

    The audio route changed.

    - (void) pocketsphinxDidStartCalibration      

    Pocketsphinx isn't listening yet but it started calibration.

    - (void) pocketsphinxDidCompleteCalibration      

    Pocketsphinx isn't listening yet but calibration completed.

    - (void) pocketsphinxRecognitionLoopDidStart      

    Pocketsphinx isn't listening yet but it has entered the main recognition loop.

    - (void) pocketsphinxDidStartListening      

    Pocketsphinx is now listening.

    - (void) pocketsphinxDidDetectSpeech      

    Pocketsphinx heard speech and is about to process it.

    - (void) pocketsphinxDidDetectFinishedSpeech      

    Pocketsphinx detected a second of silence indicating the end of an utterance

    - (void) pocketsphinxDidReceiveHypothesis:   (NSString *)  hypothesis
    recognitionScore:   (NSString *)  recognitionScore
    utteranceID:   (NSString *)  utteranceID 
           

    Pocketsphinx has a hypothesis.

    - (void) pocketsphinxDidReceiveNBestHypothesisArray:   (NSArray *)  hypothesisArray  

    Pocketsphinx has an array of n-best hypotheses.
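
    If returnNbest is enabled, handling the callback might look like the sketch below. The keys @"Hypothesis" and @"Score" are assumptions about the dictionaries inside hypothesisArray; verify them by logging the raw array first.

    // Sketch: log each n-best entry delivered by OpenEarsEventsObserver.
    - (void) pocketsphinxDidReceiveNBestHypothesisArray:(NSArray *)hypothesisArray {
    	for(NSDictionary *hypothesisDictionary in hypothesisArray) {
    		NSLog(@"n-best entry: %@ (score %@)",
    		      [hypothesisDictionary objectForKey:@"Hypothesis"],
    		      [hypothesisDictionary objectForKey:@"Score"]);
    	}
    }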

    - (void) pocketsphinxDidStopListening      

    Pocketsphinx has exited the continuous listening loop.

    - (void) pocketsphinxDidSuspendRecognition      

    Pocketsphinx has not exited the continuous listening loop but it will not attempt recognition.

    - (void) pocketsphinxDidResumeRecognition      

    Pocketsphinx has not exited the continuous listening loop and it will now start attempting recognition again.

    - (void) pocketsphinxDidChangeLanguageModelToFile:   (NSString *)  newLanguageModelPathAsString
    andDictionary:   (NSString *)  newDictionaryPathAsString 
           

    Pocketsphinx switched language models inline.

    - (void) pocketSphinxContinuousSetupDidFail      

    Some aspect of setting up the continuous loop failed; turn on OpenEarsLogging for more info.

    - (void) fliteDidStartSpeaking      

    Flite started speaking. You probably don't have to do anything about this.

    - (void) fliteDidFinishSpeaking      

    Flite finished speaking. You probably don't have to do anything about this.


  • A few of us (mostly old members I guess) are becoming a bit tired of the welcome slides and talk. The aim of this issue is to start a discussion on how we could reduce the time it takes to ...
  • DeepSpeech on PaddlePaddle is an open-source implementation of an end-to-end Automatic Speech Recognition (ASR) engine, built on the PaddlePaddle platform. Our vision is to empower both industrial application ...
  • 170712 python_speech_features

    2017-07-12 12:08:00

    Welcome to python_speech_features's documentation!
    Audio tools for Linux command-line geeks
    Managing Linguistic Data
    Timit
    Quickly cloning a website (Teleport Ultra)

    Code:

    from python_speech_features import mfcc
    from python_speech_features import logfbank
    import scipy.io.wavfile as wav

    # Read the sample rate and signal from a WAV file.
    (rate, sig) = wav.read("file.wav")
    # Compute MFCC and log filterbank features, one row per frame.
    mfcc_feat = mfcc(sig, rate)
    fbank_feat = logfbank(sig, rate)

    print(fbank_feat[1:3, :])

    mfcc_feat and fbank_feat are 2D matrices:
    the number of rows is the number of frames (each frame covers a short time slice of the signal),
    the number of columns is the number of cepstral coefficients or filterbank channels.

  • Dialog title discarded when finding text in browse mode: https://github.com/nvaccess/nvda/issues/11144. Causes system tests to fail, the welcome dialog is not read as expected....
  • It seems like the default way of specifying animals ... Would this be a welcome change? I can look into it if I find some spare time. (This question comes from the open-source project sckott/cowsay)
  • We also welcome your requests for this particular direction and contributions! Progress & Plan #2124: [x] A new task for speech enhancement; [x] Speech enhancement recipe ...
  • Welcome email feature

    2020-12-09 02:30:35
    Chime in below in a comment to cooperate with others who are also working on this task. Report back at the above issue once you succeed, to encourage others and share what...
  • Issue for -variation, who will fill out the details for the community on what is involved. The basics, though: ... Thoughts welcome. (This question comes from the open-source project cltk/cltk)
  • Now that snapshot versions also include OneCore voices I think anyone using speech would benefit from being able to change synthesizer and voice rate. When changing to a new synthesizer the standard ...
  • I have used the default LJSpeech for training and as we can see, the alignment has been quick to be learnt in 7K steps and the quantity of the dataset was less than 20 hours (less than 10K...
  • Now we play the welcome audio when login is done, which can cause TTS and speech to be unavailable while the welcome is playing. (This question comes from the open-source project yodaos-project/yoda.js)
  • ...) removes the speech bubble; changes the copy to not reference Lockie. Here's the before state: [screenshot: screen shot 2017-11-08 at ...]
  • From a conversation in the user e-mail list, the idea was put forward to have an interactive tutorial for new users from the welcome screen after you install NVDA. The tutorial would work ...
  • From: Download Politepix’s OpenEars

    OpenEars is a shared-source iOS framework for iPhone voice recognition and TTS. It lets you implement round-trip English language speech recognition and text-to-speech on the iPhone and iPad and uses the open source CMU Pocketsphinx, CMU Flite, and CMUCLMTK libraries. Highly-accurate large-vocabulary recognition (that is, trying to recognize any word the user speaks out of many thousands of known words) is not yet a reality for local in-app processing on the iPhone given the hardware limitations of the platform; even Siri does its large-vocabulary recognition on the server side. However, Pocketsphinx (the open source voice recognition engine that OpenEars uses) is capable of local recognition on the iPhone of vocabularies with hundreds of words depending on the environment and other factors, and performs very well with command-and-control language models. The best part is that it uses no network connectivity — all processing occurs locally on the device.

    The current version of the OpenEars iPhone speech recognition API is 1.1.

    OpenEars can:

    • Listen continuously for speech on a background thread, while suspending or resuming speech processing on demand, all while using less than 8% CPU on average on a first-generation iPhone (decoding speech, text-to-speech, updating the UI and other intermittent functions use more CPU),
    • Use any of 9 voices for speech, including male and female voices with a range of speed/quality level, and switch between them on the fly,
    • Change the pitch, speed and variance of any text-to-speech voice,
    • Know whether headphones are plugged in and continue voice recognition during text-to-speech only when they are plugged in,
    • Support bluetooth audio devices (experimental),
    • Dispatch information to any part of your app about the results of speech recognition and speech, or changes in the state of the audio session (such as an incoming phone call or headphones being plugged in),
    • Deliver level metering for both speech input and speech output so you can design visual feedback for both states.
    • Support JSGF grammars,
    • Dynamically generate new ARPA language models in-app based on input from an NSArray of NSStrings,
    • Switch between ARPA language models or JSGF grammars on the fly,
    • Get n-best lists with scoring,
    • Test existing recordings,
    • Be easily interacted with via standard and simple Objective-C methods,
    • Control all audio functions with text-to-speech and speech recognition in memory instead of writing audio files to disk and then reading them,
    • Drive speech recognition with a low-latency Audio Unit driver for highest responsiveness,
    • Be installed in a Cocoa-standard fashion using an easy-peasy already-compiled framework.

    In addition to its various new features and faster recognition/text-to-speech responsiveness, OpenEars now has improved recognition accuracy.

    Before using OpenEars, please note that its low-latency Audio Unit driver is not compatible with the Simulator, so it has a fallback Audio Queue driver for the Simulator provided as a convenience so you can debug recognition logic. This means that recognition is better on the device, and that I’d appreciate it if bug reports are limited to issues which affect the device.

    To use OpenEars:

    1. Download the distribution and unpack it.

    2. Create your own app, and add the iOS frameworks AudioToolbox and AVFoundation to it.

    3. Inside your downloaded distribution there is a folder called “frameworks” that is inside the folder called “OpenEars”. Drag the “frameworks” folder into your app project in Xcode.

    OK, now that you’ve finished laying the groundwork, you have to…wait, that’s everything. You’re ready to start using OpenEars.

    Before shipping your app, you will want to remove unused voices from it so that the app size won’t be too big, as explained here.

    If the steps on this page didn’t work for you, you can get free support at the forums, read the FAQ, or open a private email support incident at the Politepix shop. Otherwise, carry on to the next part: using OpenEars in your app.

    OpenEars uses the open source speech recognition engine Pocketsphinx from Carnegie Mellon University:
  • [Prompt]
    A foreign delegation is to visit your university. You are assigned to make a welcome speech on behalf of your class. Now write a welcome speech to
      1)express your welcome, and
      2)make a brief introduction to your university.
      You should write about 100 words on ANSWER SHEET 2. Do not sign your own name at the end of the letter. Use “Li Ming” instead. You do not need to write the address. (10 points)

    [Sample answer] A Welcome Speech (130 words)
    Ladies and Gentlemen,

    First of all, please allow me to express the most heartfelt welcome to all of you on behalf of our Class One in the Computer Science Department of Tsinghua University. We have been looking forward to seeing you for long. It is a wonderful day today.

    Now I would like to brief my university to you since I want to leave the most wonderful for you to discover. Tsinghua University is well-known both at home and abroad. If you want to meet distinguished scholars, please come to Tsinghua. If you want to meet the most industrious students, please come to Tsinghua. If you want to discover the most attractive campus, please come to Tsinghua. I do hope that you will enjoy your stay in Tsinghua.  
                            Sincerely Yours

                                Li Ming

    Reposted from: https://www.cnblogs.com/OceanChen/archive/2009/03/18/1415285.html

    展开全文
  • UCAS English mini-assignment

    2020-03-15 21:22:39
    1. If you were your school's speech ... Hello everyone, welcome to the speech club. Thank you for taking your precious spare time from your busy study life to join our speech club. Our club holds an exchange meeti...
  • Welcome message one reprompt or ask for help for more options. What would you like?" } ], "WelcomeBackMessage": [ "Welcome back!.", "Good to see you again." ], "...
  • Not working in device

    2020-12-08 20:54:37
    Good Morning, when I run the code listed (excerpt) in the emulator it ...Welcome', voice: 'en-GB' }); Thanks (This question comes from the open-source project naoufal/react-native-speech)
  • Explains must-know phrases and frequently tested translation patterns for the intermediate interpretation exam: 1. I am very grateful for ... Reference: Thank you very much for... ... Reference: gracious speech of welcome 3. one of ... Reference: be one of 4. A visit to ... is ... Reference: A visit to...has...
  • Trigger Word Detection ...Welcome to the final programming assignment of this specialization! In this week's videos, you learned about applying deep learning to speech recognition. In this a...
  • Expected OutputTrigger Word ...Welcome to the final programming assignment of this specialization! In this week's videos, you learned about applying deep learning to speech recognition. In this...
  • Trigger word detection - v1 reference answers

    2018-03-04 21:44:05
    Welcome to the final programming assignment of this specialization! In this week’s videos, you learned about applying deep learning to speech recognition. In this assignment,...
  • Common sentences 1

    2012-02-21 22:38:57
    1. I am very grateful to ... Reference: Thank you very much ... Reference: gracious speech of welcome 3. one of ... Reference: be one of 4. A visit to ... is ... Reference: A visit to...has... 5. a wish dreamed of for many years Reference: has
  • Reposted from the notebook assignment in Andrew Ng's deep learning course ... Welcome to the final programming assignment of this specialization! In this week’s videos, you learned about applying deep learning to speech recognitio...
  • Python speech recognition: Welcome to The Complete Beginner’s Guide to Speech Recognition in Python. In this post, I will walk you through some great hands-on exercises that ...
