IPC entry: G10L 21/049 [Version 2017.01]
(Type codes: SK = section, KL = class, UKL = subclass, HGR = main group, UGRn = subgroup at dot level n.)

Symbol        Type  Title
G             SK    PHYSICS
G10           KL    MUSICAL INSTRUMENTS; ACOUSTICS
G10L          UKL   SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING [4]
G10L 13/00    HGR   Speech synthesis; Text to speech systems [7, 2006.01]
G10L 13/02    UGR1  . Methods for producing synthetic speech; Speech synthesisers [7, 2006.01, 2013.01]
G10L 13/027   UGR2  . . Concept to speech synthesisers; Generation of natural phrases from machine-based concepts (generation of parameters for speech synthesis out of text G10L 13/08) [2013.01]
G10L 13/033   UGR2  . . Voice editing, e.g. manipulating the voice of the synthesiser [2013.01]
G10L 13/04    UGR2  . . Details of speech synthesis systems, e.g. synthesiser structure or memory management [7, 2006.01, 2013.01]
G10L 13/047   UGR3  . . . Architecture of speech synthesisers [2013.01]
G10L 13/06    UGR1  . Elementary speech units used in speech synthesisers; Concatenation rules [7, 2006.01, 2013.01]
G10L 13/07    UGR2  . . Concatenation rules [2013.01]
G10L 13/08    UGR1  . Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination [7, 2006.01, 2013.01]
G10L 13/10    UGR2  . . Prosody rules derived from text; Stress or intonation [2013.01]
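Entry G10L 13/08 above covers grapheme to phoneme translation. As an illustrative sketch only (not part of the scheme), the following Python fragment shows the common lexicon-plus-fallback pattern; the toy lexicon, phone symbols and letter-to-sound map are invented for the example.

```python
# Minimal grapheme-to-phoneme sketch (cf. G10L 13/08). The lexicon and the
# single-letter fallback rules are illustrative; real systems use large
# pronunciation dictionaries plus trained letter-to-sound models.

TOY_LEXICON = {  # word -> phone sequence (ARPAbet-like, invented entries)
    "speech": ["S", "P", "IY1", "CH"],
    "synthesis": ["S", "IH1", "N", "TH", "AH0", "S", "AH0", "S"],
}

FALLBACK = {  # naive one-letter-to-one-sound map, illustration only
    "a": "AE", "e": "EH", "i": "IH", "o": "OW", "u": "AH",
    "b": "B", "c": "K", "d": "D", "f": "F", "g": "G", "h": "HH",
    "j": "JH", "k": "K", "l": "L", "m": "M", "n": "N", "p": "P",
    "q": "K", "r": "R", "s": "S", "t": "T", "v": "V", "w": "W",
    "x": "K S", "y": "Y", "z": "Z",
}

def grapheme_to_phoneme(word):
    """Dictionary lookup first; letter-by-letter fallback otherwise."""
    word = word.lower()
    if word in TOY_LEXICON:
        return TOY_LEXICON[word]
    phones = []
    for letter in word:
        phones.extend(FALLBACK.get(letter, "").split())
    return phones

if __name__ == "__main__":
    print(grapheme_to_phoneme("speech"))   # lexicon hit
    print(grapheme_to_phoneme("vocoder"))  # fallback rules
```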
G10L 15/00    HGR   Speech recognition (G10L 17/00 takes precedence) [7, 2006.01, 2013.01]
G10L 15/01    UGR1  . Assessment or evaluation of speech recognition systems [2013.01]
G10L 15/02    UGR1  . Feature extraction for speech recognition; Selection of recognition unit [7, 2006.01]
G10L 15/04    UGR1  . Segmentation; Word boundary detection [7, 2006.01, 2013.01]
G10L 15/05    UGR2  . . Word boundary detection [2013.01]
G10L 15/06    UGR1  . Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L 15/14 takes precedence) [7, 2006.01, 2013.01]
G10L 15/065   UGR2  . . Adaptation [2013.01]
G10L 15/07    UGR3  . . . to the speaker [2013.01]
G10L 15/08    UGR1  . Speech classification or search [7, 2006.01]
G10L 15/10    UGR2  . . using distance or distortion measures between unknown speech and reference templates [7, 2006.01]
G10L 15/12    UGR2  . . using dynamic programming techniques, e.g. dynamic time warping [DTW] [7, 2006.01]
G10L 15/14    UGR2  . . using statistical models, e.g. Hidden Markov Models [HMM] (G10L 15/18 takes precedence) [7, 2006.01]
G10L 15/16    UGR2  . . using artificial neural networks [7, 2006.01]
G10L 15/18    UGR2  . . using natural language modelling [7, 2006.01, 2013.01]
G10L 15/183   UGR3  . . . using context dependencies, e.g. language models [2013.01]
G10L 15/187   UGR4  . . . . Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams [2013.01]
G10L 15/19    UGR4  . . . . Grammatical context, e.g. disambiguation of recognition hypotheses based on word sequence rules [2013.01]
G10L 15/193   UGR5  . . . . . Formal grammars, e.g. finite state automata, context free grammars or word networks [2013.01]
G10L 15/197   UGR5  . . . . . Probabilistic grammars, e.g. word n-grams [2013.01]
G10L 15/20    UGR1  . Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech (G10L 21/02 takes precedence) [7, 2006.01]
G10L 15/22    UGR1  . Procedures used during a speech recognition process, e.g. man-machine dialog [7, 2006.01]
G10L 15/24    UGR1  . Speech recognition using non-acoustical features [7, 2006.01, 2013.01]
G10L 15/25    UGR2  . . using position of the lips, movement of the lips or face analysis [2013.01]
G10L 15/26    UGR1  . Speech to text systems (G10L 15/08 takes precedence) [7, 2006.01]
G10L 15/28    UGR1  . Constructional details of speech recognition systems [7, 2006.01, 2013.01]
G10L 15/30    UGR2  . . Distributed recognition, e.g. in client-server systems, for mobile phones or network applications [2013.01]
G10L 15/32    UGR2  . . Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems [2013.01]
G10L 15/34    UGR2  . . Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing [2013.01]
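Subgroups G10L 15/10 and G10L 15/12 above name template matching with distance measures and dynamic time warping [DTW]. As an illustration (not part of the scheme), a minimal pure-Python DTW sketch, assuming feature vectors are plain lists of floats and using the classic symmetric step pattern:

```python
# Minimal dynamic time warping [DTW] sketch (cf. G10L 15/12): aligns an
# unknown feature sequence against a reference template using a Euclidean
# local distance (cf. G10L 15/10).
import math

def dtw_distance(seq_a, seq_b):
    """Accumulated local distance along the optimal warping path."""
    n, m = len(seq_a), len(seq_b)
    INF = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])  # local distance
            # classic step pattern: diagonal match, insertion, deletion
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    return cost[n][m]

if __name__ == "__main__":
    template = [[0.0], [1.0], [2.0], [1.0], [0.0]]
    utterance = [[0.0], [0.9], [2.1], [2.0], [1.1], [0.2]]  # time-stretched
    print(dtw_distance(template, utterance))
```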
G10L 17/00    HGR   Speaker identification or verification [7, 2006.01, 2013.01]
G10L 17/02    UGR1  . Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction [2013.01]
G10L 17/04    UGR1  . Training, enrolment or model building [2013.01]
G10L 17/06    UGR1  . Decision making techniques; Pattern matching strategies [2013.01]
G10L 17/08    UGR2  . . Use of distortion metrics or a particular distance between probe pattern and reference templates [2013.01]
G10L 17/10    UGR2  . . Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems [2013.01]
G10L 17/12    UGR2  . . Score normalisation [2013.01]
G10L 17/14    UGR2  . . Use of phonemic categorisation or speech recognition prior to speaker recognition or verification [2013.01]
G10L 17/16    UGR1  . Hidden Markov models [HMMs] [2013.01]
G10L 17/18    UGR1  . Artificial neural networks; Connectionist approaches [2013.01]
G10L 17/20    UGR1  . Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions [2013.01]
G10L 17/22    UGR1  . Interactive procedures; Man-machine interfaces [2013.01]
G10L 17/24    UGR2  . . the user being prompted to utter a password or a predefined phrase [2013.01]
G10L 17/26    UGR1  . Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices [2013.01]
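Subgroup G10L 17/12 above covers score normalisation in speaker verification. One standard instance is zero normalisation (z-norm), where a raw score is rescaled by the mean and standard deviation of the claimed model's scores against an impostor cohort, so one global decision threshold can be used. A sketch with made-up cohort values:

```python
# Z-norm sketch for speaker-verification scores (cf. G10L 17/12).
# The cohort scores below are invented for illustration.
import statistics

def z_norm(raw_score, cohort_scores):
    """Normalise a raw score against impostor-cohort statistics."""
    mu = statistics.mean(cohort_scores)
    sigma = statistics.stdev(cohort_scores)
    return (raw_score - mu) / sigma

if __name__ == "__main__":
    impostor_cohort = [-1.2, -0.8, -1.0, -0.9, -1.1]  # illustrative values
    print(z_norm(0.4, impostor_cohort))  # far above the cohort -> likely target
```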
G10L 19/00    HGR   Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis (in musical instruments G10H) [7, 2006.01, 2013.01]
G10L 19/002   UGR1  . Dynamic bit allocation (for perceptual audio coders G10L 19/032) [2013.01]
G10L 19/005   UGR1  . Correction of errors induced by the transmission channel, if related to the coding algorithm [2013.01]
G10L 19/008   UGR1  . Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing [2013.01]
G10L 19/012   UGR1  . Comfort noise or silence coding [2013.01]
G10L 19/018   UGR1  . Audio watermarking, i.e. embedding inaudible data in the audio signal [2013.01]
G10L 19/02    UGR1  . using spectral analysis, e.g. transform vocoders or subband vocoders [7, 2006.01, 2013.01]
G10L 19/022   UGR2  . . Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring [2013.01]
G10L 19/025   UGR3  . . . Detection of transients or attacks for time/frequency resolution switching [2013.01]
G10L 19/028   UGR2  . . Noise substitution, e.g. substituting non-tonal spectral components by noisy source (comfort noise for discontinuous speech transmission G10L 19/012) [2013.01]
G10L 19/03    UGR2  . . Spectral prediction for preventing pre-echo; Temporary noise shaping [TNS], e.g. in MPEG2 or MPEG4 [2013.01]
G10L 19/032   UGR2  . . Quantisation or dequantisation of spectral components [2013.01]
G10L 19/035   UGR3  . . . Scalar quantisation [2013.01]
G10L 19/038   UGR3  . . . Vector quantisation, e.g. TwinVQ audio [2013.01]
G10L 19/04    UGR1  . using predictive techniques [7, 2006.01, 2013.01]
G10L 19/06    UGR2  . . Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients [7, 2006.01, 2013.01]
G10L 19/07    UGR3  . . . Line spectrum pair [LSP] vocoders [2013.01]
G10L 19/08    UGR2  . . Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters [7, 2006.01, 2013.01]
G10L 19/083   UGR3  . . . the excitation function being an excitation gain (G10L 25/90 takes precedence) [2013.01]
G10L 19/087   UGR3  . . . using mixed excitation models, e.g. MELP, MBE, split band LPC or HVXC [2013.01]
G10L 19/09    UGR3  . . . Long term prediction, i.e. removing periodical redundancies, e.g. by using adaptive codebook or pitch predictor [2013.01]
G10L 19/093   UGR3  . . . using sinusoidal excitation models [2013.01]
G10L 19/097   UGR3  . . . using prototype waveform decomposition or prototype waveform interpolative [PWI] coders [2013.01]
G10L 19/10    UGR3  . . . the excitation function being a multipulse excitation [7, 2006.01, 2013.01]
G10L 19/107   UGR4  . . . . Sparse pulse excitation, e.g. by using algebraic codebook [2013.01]
G10L 19/113   UGR4  . . . . Regular pulse excitation [2013.01]
G10L 19/12    UGR3  . . . the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders [7, 2006.01, 2013.01]
G10L 19/125   UGR4  . . . . Pitch excitation, e.g. pitch synchronous innovation CELP [PSI-CELP] [2013.01]
G10L 19/13    UGR4  . . . . Residual excited linear prediction [RELP] [2013.01]
G10L 19/135   UGR4  . . . . Vector sum excited linear prediction [VSELP] [2013.01]
G10L 19/16    UGR2  . . Vocoder architecture [2013.01]
G10L 19/18    UGR3  . . . Vocoders using multiple modes [2013.01]
G10L 19/20    UGR4  . . . . using sound class specific coding, hybrid encoders or object based coding [2013.01]
G10L 19/22    UGR4  . . . . Mode decision, i.e. based on audio signal content versus external parameters [2013.01]
G10L 19/24    UGR4  . . . . Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding [2013.01]
G10L 19/26    UGR2  . . Pre-filtering or post-filtering [2013.01]
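Subgroups G10L 19/04 and G10L 19/06 above concern predictive coding and the short-term prediction coefficients. As an illustration, a sketch of the textbook autocorrelation method with the Levinson-Durbin recursion; NumPy is assumed, and the model order and demo signal are illustrative:

```python
# Short-term (LPC) prediction coefficients via Levinson-Durbin
# (cf. G10L 19/04 and G10L 19/06).
import numpy as np

def lpc(frame, order):
    """Return the all-pole polynomial a[0..order] (a[0] == 1)."""
    # autocorrelation r[0..order] of the analysis frame
    r = np.array([frame[: len(frame) - k] @ frame[k:] for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]                                   # prediction error power
    for i in range(1, order + 1):
        acc = r[i] + a[1:i] @ r[i - 1:0:-1]      # forward prediction residual
        k = -acc / err                           # reflection coefficient
        a_prev = a.copy()
        a[1:i + 1] = a_prev[1:i + 1] + k * a_prev[i - 1::-1]
        err *= 1.0 - k * k                       # error update per stage
    return a

if __name__ == "__main__":
    # AR(2) test signal: x[n] = 1.3 x[n-1] - 0.6 x[n-2] + e[n]
    rng = np.random.default_rng(0)
    e = rng.standard_normal(4000)
    x = np.zeros_like(e)
    for n in range(2, len(e)):
        x[n] = 1.3 * x[n - 1] - 0.6 * x[n - 2] + e[n]
    print(lpc(x, 2))  # approx [1, -1.3, 0.6]
```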
G10L 21/00    HGR   Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility (G10L 19/00 takes precedence) [7, 2006.01, 2013.01]
G10L 21/003   UGR1  . Changing voice quality, e.g. pitch or formants [2013.01]
G10L 21/007   UGR2  . . characterised by the process used [2013.01]
G10L 21/01    UGR3  . . . Correction of time axis [2013.01]
G10L 21/013   UGR3  . . . Adapting to target pitch [2013.01]
G10L 21/02    UGR1  . Speech enhancement, e.g. noise reduction or echo cancellation (reducing echo effects in line transmission systems H04B 3/20; echo suppression in hands-free telephones H04M 9/08) [7, 2006.01, 2013.01]
G10L 21/0208  UGR2  . . Noise filtering [2013.01]
G10L 21/0216  UGR3  . . . characterised by the method used for estimating noise [2013.01]
G10L 21/0224  UGR4  . . . . Processing in the time domain [2013.01]
G10L 21/0232  UGR4  . . . . Processing in the frequency domain [2013.01]
G10L 21/0264  UGR3  . . . characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques [2013.01]
G10L 21/0272  UGR2  . . Voice signal separating [2013.01]
G10L 21/028   UGR3  . . . using properties of sound source [2013.01]
G10L 21/0308  UGR3  . . . characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques [2013.01]
G10L 21/0316  UGR2  . . by changing the amplitude [2013.01]
G10L 21/0324  UGR3  . . . Details of processing therefor [2013.01]
G10L 21/0332  UGR4  . . . . involving modification of waveforms [2013.01]
G10L 21/034   UGR4  . . . . Automatic adjustment [2013.01]
G10L 21/0356  UGR3  . . . for synchronising with other signals, e.g. video signals [2013.01]
G10L 21/0364  UGR3  . . . for improving intelligibility [2013.01]
G10L 21/038   UGR2  . . using band spreading techniques [2013.01]
G10L 21/0388  UGR3  . . . Details of processing therefor [2013.01]
G10L 21/04    UGR1  . Time compression or expansion [7, 2006.01, 2013.01]
G10L 21/043   UGR2  . . by changing speed [2013.01]
G10L 21/045   UGR3  . . . using thinning out or insertion of a waveform [2013.01]
G10L 21/047   UGR4  . . . . characterised by the type of waveform to be thinned out or inserted [2013.01]
G10L 21/049   UGR4  . . . . characterised by the interconnection of waveforms [2013.01]
G10L 21/055   UGR2  . . for synchronising with other signals, e.g. video signals [2013.01]
G10L 21/057   UGR2  . . for improving intelligibility [2013.01]
G10L 21/06    UGR1  . Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids (G10L 15/26 takes precedence) [7, 2006.01, 2013.01]
G10L 21/10    UGR2  . . Transforming into visible information [2013.01]
G10L 21/12    UGR3  . . . by displaying time domain information [2013.01]
G10L 21/14    UGR3  . . . by displaying frequency domain information [2013.01]
G10L 21/16    UGR2  . . Transforming into a non-visible representation (devices or methods enabling ear patients to replace direct auditory perception by another kind of perception A61F 11/04) [2013.01]
G10L 21/18    UGR2  . . Details of the transformation process [2013.01]
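Subgroups G10L 21/0208 and G10L 21/0232 above cover noise filtering with processing in the frequency domain. Spectral subtraction is one classic technique in this area; the sketch below assumes a noise-only stretch of signal is available for the noise estimate, and the frame size, overlap and spectral floor are illustrative choices (NumPy assumed):

```python
# Frequency-domain noise filtering by spectral subtraction
# (cf. G10L 21/0208, G10L 21/0232).
import numpy as np

def spectral_subtraction(noisy, noise_only, frame=256, floor=0.05):
    """Subtract an estimated noise magnitude spectrum frame by frame."""
    hop = frame // 2
    window = np.hanning(frame)
    # noise magnitude spectrum from a noise-only stretch of the recording
    noise_mag = np.abs(np.fft.rfft(noise_only[:frame] * window))
    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame + 1, hop):
        seg = noisy[start:start + frame] * window
        spec = np.fft.rfft(seg)
        mag = np.abs(spec) - noise_mag               # subtract noise estimate
        mag = np.maximum(mag, floor * np.abs(spec))  # floor limits musical noise
        # resynthesise with the noisy phase, overlap-add the frames
        clean = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), n=frame)
        out[start:start + frame] += clean            # Hann at 50% overlap ~ unity
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    t = np.arange(8000) / 8000.0
    speechlike = np.sin(2 * np.pi * 200 * t)         # stand-in for speech
    noise = 0.3 * rng.standard_normal(t.size)
    enhanced = spectral_subtraction(speechlike + noise, noise)
```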
G10L 25/00    HGR   Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00 (muting semiconductor-based amplifiers when some special characteristics of a signal are sensed by a speech detector, e.g. sensing when no signal is present, H03G 3/34) [2013.01]
G10L 25/03    UGR1  . characterised by the type of extracted parameters [2013.01]
G10L 25/06    UGR2  . . the extracted parameters being correlation coefficients [2013.01]
G10L 25/09    UGR2  . . the extracted parameters being zero crossing rates [2013.01]
G10L 25/12    UGR2  . . the extracted parameters being prediction coefficients [2013.01]
G10L 25/15    UGR2  . . the extracted parameters being formant information [2013.01]
G10L 25/18    UGR2  . . the extracted parameters being spectral information of each sub-band [2013.01]
G10L 25/21    UGR2  . . the extracted parameters being power information [2013.01]
G10L 25/24    UGR2  . . the extracted parameters being the cepstrum [2013.01]
G10L 25/27    UGR1  . characterised by the analysis technique [2013.01]
G10L 25/30    UGR2  . . using neural networks [2013.01]
G10L 25/33    UGR2  . . using fuzzy logic [2013.01]
G10L 25/36    UGR2  . . using chaos theory [2013.01]
G10L 25/39    UGR2  . . using genetic algorithms [2013.01]
G10L 25/45    UGR1  . characterised by the type of analysis window [2013.01]
G10L 25/48    UGR1  . specially adapted for particular use [2013.01]
G10L 25/51    UGR2  . . for comparison or discrimination [2013.01]
G10L 25/54    UGR3  . . . for retrieval [2013.01]
G10L 25/57    UGR3  . . . for processing of video signals [2013.01]
G10L 25/60    UGR3  . . . for measuring the quality of voice signals [2013.01]
G10L 25/63    UGR3  . . . for estimating an emotional state [2013.01]
G10L 25/66    UGR3  . . . for extracting parameters related to health condition (detecting or measuring for diagnostic purposes A61B 5/00) [2013.01]
G10L 25/69    UGR2  . . for evaluating synthetic or decoded voice signals [2013.01]
G10L 25/72    UGR2  . . for transmitting results of analysis [2013.01]
G10L 25/75    UGR1  . for modelling vocal tract parameters [2013.01]
G10L 25/78    UGR1  . Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M 9/10) [2013.01]
G10L 25/81    UGR2  . . for discriminating voice from music [2013.01]
G10L 25/84    UGR2  . . for discriminating voice from noise [2013.01]
G10L 25/87    UGR2  . . Detection of discrete points within a voice signal [2013.01]
G10L 25/90    UGR1  . Pitch determination of speech signals [2013.01]
G10L 25/93    UGR1  . Discriminating between voiced and unvoiced parts of speech signals (G10L 25/90 takes precedence) [2013.01]
G10L 99/00    HGR   Subject matter not provided for in other groups of this subclass [2013.01]
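Among the extracted parameters listed under G10L 25/00, zero crossing rates (G10L 25/09) and pitch (G10L 25/90) have particularly compact reference implementations. A sketch of both follows; the autocorrelation pitch estimator's sampling rate and search band are illustrative choices (NumPy assumed):

```python
# Zero-crossing rate (cf. G10L 25/09) and autocorrelation pitch
# determination (cf. G10L 25/90) on a single analysis frame.
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    return np.mean(np.signbit(frame[:-1]) != np.signbit(frame[1:]))

def pitch_autocorr(frame, fs, fmin=50.0, fmax=400.0):
    """Pitch in Hz from the highest autocorrelation peak in [fmin, fmax]."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)   # lag range for the search band
    lag = lo + int(np.argmax(ac[lo:hi]))
    return fs / lag

if __name__ == "__main__":
    fs = 16000.0
    t = np.arange(1024) / fs
    voiced = np.sin(2 * np.pi * 120.0 * t)     # 120 Hz "voiced" test tone
    print(round(pitch_autocorr(voiced, fs)))   # ~120
    print(zero_crossing_rate(voiced))          # low ZCR, typical of voiced speech
```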