G | SK | PHYSICS |
G10 | KL | MUSICAL INSTRUMENTS; ACOUSTICS |
G10L | UKL | SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING [4] |
G10L 13/00 | HGR | Speech synthesis; Text to speech systems [7, 2006.01] |
G10L 13/02 | UGR1 | . | Methods for producing synthetic speech; Speech synthesisers [7, 2006.01, 2013.01] |
|
G10L 13/027 | UGR2 | . . | Concept to speech synthesisers; Generation of natural phrases from machine-based concepts (generation of parameters for speech synthesis out of text G10L 13/08) [2013.01] |
|
G10L 13/033 | UGR2 | . . | Voice editing, e.g. manipulating the voice of the synthesiser [2013.01] |
|
G10L 13/04 | UGR2 | . . | Details of speech synthesis systems, e.g. synthesiser structure or memory management [7, 2006.01, 2013.01] |
|
G10L 13/047 | UGR3 | . . . | Architecture of speech synthesisers [2013.01] |
|
G10L 13/06 | UGR1 | . | Elementary speech units used in speech synthesisers; Concatenation rules [7, 2006.01, 2013.01] |
|
G10L 13/07 | UGR2 | . . | Concatenation rules [2013.01] |
|
G10L 13/08 | UGR1 | . | Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination [7, 2006.01, 2013.01] |
|
G10L 13/10 | UGR2 | . . | Prosody rules derived from text; Stress or intonation [2013.01] |
|
G10L 15/00 | HGR | Speech recognition (G10L 17/00 takes precedence) [7, 2006.01, 2013.01] |
G10L 15/01 | UGR1 | . | Assessment or evaluation of speech recognition systems [2013.01] |
|
G10L 15/02 | UGR1 | . | Feature extraction for speech recognition; Selection of recognition unit [7, 2006.01] |
|
G10L 15/04 | UGR1 | . | Segmentation; Word boundary detection [7, 2006.01, 2013.01] |
|
G10L 15/05 | UGR2 | . . | Word boundary detection [2013.01] |
|
G10L 15/06 | UGR1 | . | Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L 15/14 takes precedence) [7, 2006.01, 2013.01] |
|
G10L 15/065 | UGR2 | . . | Adaptation [2013.01]

G10L 15/07 | UGR3 | . . . | to the speaker [2013.01] |
|
G10L 15/08 | UGR1 | . | Speech classification or search [7, 2006.01] |
|
G10L 15/10 | UGR2 | . . | using distance or distortion measures between unknown speech and reference templates [7, 2006.01] |
|
G10L 15/12 | UGR2 | . . | using dynamic programming techniques, e.g. dynamic time warping [DTW] [7, 2006.01] |
|
G10L 15/14 | UGR2 | . . | using statistical models, e.g. Hidden Markov Models [HMM] (G10L 15/18 takes precedence) [7, 2006.01] |
|
G10L 15/16 | UGR2 | . . | using artificial neural networks [7, 2006.01] |
|
G10L 15/18 | UGR2 | . . | using natural language modelling [7, 2006.01, 2013.01] |
|
G10L 15/183 | UGR3 | . . . | using context dependencies, e.g. language models [2013.01] |
|
G10L 15/187 | UGR4 | . . . . | Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams [2013.01] |
|
G10L 15/19 | UGR4 | . . . . | Grammatical context, e.g. disambiguation of recognition hypotheses based on word sequence rules [2013.01] |
|
G10L 15/193 | UGR5 | . . . . . | Formal grammars, e.g. finite state automata, context free grammars or word networks [2013.01] |
|
G10L 15/197 | UGR5 | . . . . . | Probabilistic grammars, e.g. word n-grams [2013.01] |
|
G10L 15/20 | UGR1 | . | Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech (G10L 21/02 takes precedence) [7, 2006.01] |
|
G10L 15/22 | UGR1 | . | Procedures used during a speech recognition process, e.g. man-machine dialogue [7, 2006.01] |
|
G10L 15/24 | UGR1 | . | Speech recognition using non-acoustical features [7, 2006.01, 2013.01] |
|
G10L 15/25 | UGR2 | . . | using position of the lips, movement of the lips or face analysis [2013.01] |
|
G10L 15/26 | UGR1 | . | Speech to text systems (G10L 15/08 takes precedence) [7, 2006.01] |
|
G10L 15/28 | UGR1 | . | Constructional details of speech recognition systems [7, 2006.01, 2013.01] |
|
G10L 15/30 | UGR2 | . . | Distributed recognition, e.g. in client-server systems, for mobile phones or network applications [2013.01] |
|
G10L 15/32 | UGR2 | . . | Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems [2013.01] |
|
G10L 15/34 | UGR2 | . . | Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing [2013.01] |
|
G10L 17/00 | HGR | Speaker identification or verification [7, 2006.01, 2013.01] |
G10L 17/02 | UGR1 | . | Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction [2013.01] |
|
G10L 17/04 | UGR1 | . | Training, enrolment or model building [2013.01] |
|
G10L 17/06 | UGR1 | . | Decision making techniques; Pattern matching strategies [2013.01] |
|
G10L 17/08 | UGR2 | . . | Use of distortion metrics or a particular distance between probe pattern and reference templates [2013.01] |
|
G10L 17/10 | UGR2 | . . | Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems [2013.01] |
|
G10L 17/12 | UGR2 | . . | Score normalisation [2013.01] |
|
G10L 17/14 | UGR2 | . . | Use of phonemic categorisation or speech recognition prior to speaker recognition or verification [2013.01] |
|
G10L 17/16 | UGR1 | . | Hidden Markov models [HMMs] [2013.01] |
|
G10L 17/18 | UGR1 | . | Artificial neural networks; Connectionist approaches [2013.01] |
|
G10L 17/20 | UGR1 | . | Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions [2013.01] |
|
G10L 17/22 | UGR1 | . | Interactive procedures; Man-machine interfaces [2013.01] |
|
G10L 17/24 | UGR2 | . . | the user being prompted to utter a password or a predefined phrase [2013.01] |
|
G10L 17/26 | UGR1 | . | Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices [2013.01] |
|
G10L 19/00 | HGR | Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis (in musical instruments G10H) [7, 2006.01, 2013.01] |
G10L 19/002 | UGR1 | . | Dynamic bit allocation (for perceptual audio coders G10L 19/032) [2013.01] |
|
G10L 19/005 | UGR1 | . | Correction of errors induced by the transmission channel, if related to the coding algorithm [2013.01] |
|
G10L 19/008 | UGR1 | . | Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing [2013.01] |
|
G10L 19/012 | UGR1 | . | Comfort noise or silence coding [2013.01] |
|
G10L 19/018 | UGR1 | . | Audio watermarking, i.e. embedding inaudible data in the audio signal [2013.01] |
|
G10L 19/02 | UGR1 | . | using spectral analysis, e.g. transform vocoders or subband vocoders [7, 2006.01, 2013.01] |
|
G10L 19/022 | UGR2 | . . | Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring [2013.01] |
|
G10L 19/025 | UGR3 | . . . | Detection of transients or attacks for time/frequency resolution switching [2013.01] |
|
G10L 19/028 | UGR2 | . . | Noise substitution, e.g. substituting non-tonal spectral components by noisy source (comfort noise for discontinuous speech transmission G10L 19/012) [2013.01] |
|
G10L 19/03 | UGR2 | . . | Spectral prediction for preventing pre-echo; Temporal noise shaping [TNS], e.g. in MPEG2 or MPEG4 [2013.01] |
|
G10L 19/032 | UGR2 | . . | Quantisation or dequantisation of spectral components [2013.01] |
|
G10L 19/035 | UGR3 | . . . | Scalar quantisation [2013.01] |
|
G10L 19/038 | UGR3 | . . . | Vector quantisation, e.g. TwinVQ audio [2013.01] |
|
G10L 19/04 | UGR1 | . | using predictive techniques [7, 2006.01, 2013.01] |
|
G10L 19/06 | UGR2 | . . | Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients [7, 2006.01, 2013.01] |
|
G10L 19/07 | UGR3 | . . . | Line spectrum pair [LSP] vocoders [2013.01] |
|
G10L 19/08 | UGR2 | . . | Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters [7, 2006.01, 2013.01] |
|
G10L 19/083 | UGR3 | . . . | the excitation function being an excitation gain (G10L 25/90 takes precedence) [2013.01] |
|
G10L 19/087 | UGR3 | . . . | using mixed excitation models, e.g. MELP, MBE, split band LPC or HVXC [2013.01] |
|
G10L 19/09 | UGR3 | . . . | Long term prediction, i.e. removing periodical redundancies, e.g. by using adaptive codebook or pitch predictor [2013.01] |
|
G10L 19/093 | UGR3 | . . . | using sinusoidal excitation models [2013.01] |
|
G10L 19/097 | UGR3 | . . . | using prototype waveform decomposition or prototype waveform interpolative [PWI] coders [2013.01] |
|
G10L 19/10 | UGR3 | . . . | the excitation function being a multipulse excitation [7, 2006.01, 2013.01] |
|
G10L 19/107 | UGR4 | . . . . | Sparse pulse excitation, e.g. by using algebraic codebook [2013.01] |
|
G10L 19/113 | UGR4 | . . . . | Regular pulse excitation [2013.01] |
|
G10L 19/12 | UGR3 | . . . | the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders [7, 2006.01, 2013.01] |
|
G10L 19/125 | UGR4 | . . . . | Pitch excitation, e.g. pitch synchronous innovation CELP [PSI-CELP] [2013.01] |
|
G10L 19/13 | UGR4 | . . . . | Residual excited linear prediction [RELP] [2013.01] |
|
G10L 19/135 | UGR4 | . . . . | Vector sum excited linear prediction [VSELP] [2013.01] |
|
G10L 19/16 | UGR2 | . . | Vocoder architecture [2013.01] |
|
G10L 19/18 | UGR3 | . . . | Vocoders using multiple modes [2013.01] |
|
G10L 19/20 | UGR4 | . . . . | using sound class specific coding, hybrid encoders or object based coding [2013.01] |
|
G10L 19/22 | UGR4 | . . . . | Mode decision, i.e. based on audio signal content versus external parameters [2013.01] |
|
G10L 19/24 | UGR4 | . . . . | Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding [2013.01] |
|
G10L 19/26 | UGR2 | . . | Pre-filtering or post-filtering [2013.01] |
|
G10L 21/00 | HGR | Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility (G10L 19/00 takes precedence) [7, 2006.01, 2013.01] |
G10L 21/003 | UGR1 | . | Changing voice quality, e.g. pitch or formants [2013.01] |
|
G10L 21/007 | UGR2 | . . | characterised by the process used [2013.01] |
|
G10L 21/01 | UGR3 | . . . | Correction of time axis [2013.01] |
|
G10L 21/013 | UGR3 | . . . | Adapting to target pitch [2013.01] |
|
G10L 21/02 | UGR1 | . | Speech enhancement, e.g. noise reduction or echo cancellation (reducing echo effects in line transmission systems H04B 3/20; echo suppression in hands-free telephones H04M 9/08) [7, 2006.01, 2013.01] |
|
G10L 21/0208 | UGR2 | . . | Noise filtering [2013.01] |
|
G10L 21/0216 | UGR3 | . . . | characterised by the method used for estimating noise [2013.01] |
|
G10L 21/0224 | UGR4 | . . . . | Processing in the time domain [2013.01] |
|
G10L 21/0232 | UGR4 | . . . . | Processing in the frequency domain [2013.01] |
|
G10L 21/0264 | UGR3 | . . . | characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques [2013.01] |
|
G10L 21/0272 | UGR2 | . . | Voice signal separating [2013.01] |
|
G10L 21/028 | UGR3 | . . . | using properties of sound source [2013.01] |
|
G10L 21/0308 | UGR3 | . . . | characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques [2013.01] |
|
G10L 21/0316 | UGR2 | . . | by changing the amplitude [2013.01] |
|
G10L 21/0324 | UGR3 | . . . | Details of processing therefor [2013.01] |
|
G10L 21/0332 | UGR4 | . . . . | involving modification of waveforms [2013.01] |
|
G10L 21/034 | UGR4 | . . . . | Automatic adjustment [2013.01] |
|
G10L 21/0356 | UGR3 | . . . | for synchronising with other signals, e.g. video signals [2013.01] |
|
G10L 21/0364 | UGR3 | . . . | for improving intelligibility [2013.01] |
|
G10L 21/038 | UGR2 | . . | using band spreading techniques [2013.01] |
|
G10L 21/0388 | UGR3 | . . . | Details of processing therefor [2013.01] |
|
G10L 21/04 | UGR1 | . | Time compression or expansion [7, 2006.01, 2013.01] |
|
G10L 21/043 | UGR2 | . . | by changing speed [2013.01] |
|
G10L 21/045 | UGR3 | . . . | using thinning out or insertion of a waveform [2013.01] |
|
G10L 21/047 | UGR4 | . . . . | characterised by the type of waveform to be thinned out or inserted [2013.01] |
|
G10L 21/049 | UGR4 | . . . . | characterised by the interconnection of waveforms [2013.01] |
|
G10L 21/055 | UGR2 | . . | for synchronising with other signals, e.g. video signals [2013.01] |
|
G10L 21/057 | UGR2 | . . | for improving intelligibility [2013.01] |
|
G10L 21/06 | UGR1 | . | Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids (G10L 15/26 takes precedence) [7, 2006.01, 2013.01] |
|
G10L 21/10 | UGR2 | . . | Transforming into visible information [2013.01] |
|
G10L 21/12 | UGR3 | . . . | by displaying time domain information [2013.01] |
|
G10L 21/14 | UGR3 | . . . | by displaying frequency domain information [2013.01] |
|
G10L 21/16 | UGR2 | . . | Transforming into a non-visible representation (devices or methods enabling ear patients to replace direct auditory perception by another kind of perception A61F 11/04) [2013.01] |
|
G10L 21/18 | UGR2 | . . | Details of the transformation process [2013.01] |
|
G10L 25/00 | HGR | Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00 (muting semiconductor-based amplifiers when some special characteristics of a signal are sensed by a speech detector, e.g. sensing when no signal is present, H03G 3/34) [2013.01] |
G10L 25/03 | UGR1 | . | characterised by the type of extracted parameters [2013.01] |
|
G10L 25/06 | UGR2 | . . | the extracted parameters being correlation coefficients [2013.01] |
|
G10L 25/09 | UGR2 | . . | the extracted parameters being zero crossing rates [2013.01] |
|
G10L 25/12 | UGR2 | . . | the extracted parameters being prediction coefficients [2013.01] |
|
G10L 25/15 | UGR2 | . . | the extracted parameters being formant information [2013.01] |
|
G10L 25/18 | UGR2 | . . | the extracted parameters being spectral information of each sub-band [2013.01] |
|
G10L 25/21 | UGR2 | . . | the extracted parameters being power information [2013.01] |
|
G10L 25/24 | UGR2 | . . | the extracted parameters being the cepstrum [2013.01] |
|
G10L 25/27 | UGR1 | . | characterised by the analysis technique [2013.01] |
|
G10L 25/30 | UGR2 | . . | using neural networks [2013.01] |
|
G10L 25/33 | UGR2 | . . | using fuzzy logic [2013.01] |
|
G10L 25/36 | UGR2 | . . | using chaos theory [2013.01] |
|
G10L 25/39 | UGR2 | . . | using genetic algorithms [2013.01] |
|
G10L 25/45 | UGR1 | . | characterised by the type of analysis window [2013.01] |
|
G10L 25/48 | UGR1 | . | specially adapted for particular use [2013.01] |
|
G10L 25/51 | UGR2 | . . | for comparison or discrimination [2013.01] |
|
G10L 25/54 | UGR3 | . . . | for retrieval [2013.01] |
|
G10L 25/57 | UGR3 | . . . | for processing of video signals [2013.01] |
|
G10L 25/60 | UGR3 | . . . | for measuring the quality of voice signals [2013.01] |
|
G10L 25/63 | UGR3 | . . . | for estimating an emotional state [2013.01] |
|
G10L 25/66 | UGR3 | . . . | for extracting parameters related to health condition (detecting or measuring for diagnostic purposes A61B 5/00) [2013.01] |
|
G10L 25/69 | UGR2 | . . | for evaluating synthetic or decoded voice signals [2013.01] |
|
G10L 25/72 | UGR2 | . . | for transmitting results of analysis [2013.01] |
|
G10L 25/75 | UGR1 | . | for modelling vocal tract parameters [2013.01] |
|
G10L 25/78 | UGR1 | . | Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M 9/10) [2013.01] |
|
G10L 25/81 | UGR2 | . . | for discriminating voice from music [2013.01] |
|
G10L 25/84 | UGR2 | . . | for discriminating voice from noise [2013.01] |
|
G10L 25/87 | UGR2 | . . | Detection of discrete points within a voice signal [2013.01] |
|
G10L 25/90 | UGR1 | . | Pitch determination of speech signals [2013.01] |
|
G10L 25/93 | UGR1 | . | Discriminating between voiced and unvoiced parts of speech signals (G10L 25/90 takes precedence) [2013.01] |
|
G10L 99/00 | HGR | Subject matter not provided for in other groups of this subclass [2013.01] |