What is claimed is:
1. A method of translating speech (an utterance) from a first language to a second language, the method comprising: recognizing speech by a speaker; identifying the speech by the speaker as being in the first language; initiating (starting) a translation of the speech in the first language, by a speech translation system, into the second language; recognizing, by the speech translation system, one or more prosodic (relating to prosody) cues in the speech in the first language, one or more of the prosodic cues (would "the one or more of prosodic cues" mean the same?) being of a specific type of prosodic cue; responsive to (in response to) recognizing the prosodic cues, producing a back-channel cue corresponding to the specific type of prosodic cue; providing, by the speech translation system, the produced back-channel cue to the speaker, the back-channel cue comprising (consisting of, being made up of) an audible confirmation that initiation of the translation of the speech in the first language has occurred; and determining a translation result in the second language.
2. The method of claim 1, wherein the produced back-channel cue further confirms (would "comprising the ... cue further confirming ..." also be acceptable?) that the translation of speech in the first language is currently working (operating, in progress) and uninterrupted.
3. The method of claim 1, wherein the recognized one or more prosodic cues comprises (singular verb) a pause in the speech by the speaker, the produced back-channel cue confirming that the translation of speech is in progress.
4. The method of claim 3, wherein the recognizing by the speech translation system of the one or more prosodic cues comprising a pause in the speech by the speaker adjusts sensitivity for detection of a break point beginning the pause dependent on (depending on) a speech setting for the speech by the speaker.
5. The method of claim 4, wherein the speech setting is adjustable based on input provided by the speaker.
6. The method of claim 1, wherein the one or more prosodic cues are selected from the group consisting of pauses, pitch contours, or intensity changes (alternatives joined with "or").
(Added 2016/1/23: This may simply be a drafting error. "one of A, B, or C" would be one thing, but surely "the group consisting of A, B, or C" is not acceptable?)
7. The method of claim 1, wherein the prosodic cues are selected from the group consisting of pauses and pitch contours (alternatives joined with "and").
8. A speech translation system, comprising: a processor; a speech recognition module that identifies sound comprising speech spoken in a first language by a speaker; a prosodic module that recognizes prosodic cues in the speech in the first language, one or more of the prosodic cues being of a specific type of prosodic cue; a speech synthesis module that produces, responsive to recognizing the prosodic cues, a back-channel cue corresponding to the specific type of prosodic cue and provides the produced back-channel cue to the speaker, the back-channel cue comprising an audible confirmation that initiation of the translation of the speech in the first language has occurred; and a translation module that translates and outputs (two verbs in parallel), in a second language, the speech spoken in the first language by the speaker.
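(A Japanese writer, given source text such as "a module that translates speech in a first language and outputs it in a second language", would probably write "module that translates the speech spoken in the first language and outputs the speech in a second language".)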
9. A computer program product comprising a non-transitory computer readable storage medium (non-transitory computer-readable medium) having instructions encoded thereon (having encoded instructions) that, when executed by a processor, cause the processor to (executed by the processor): recognize speech by a speaker; identify the speech by the speaker as being in the first language; initiate a translation of the speech in the first language, by a speech translation system, into the second language; recognize, by the speech translation system, one or more prosodic cues in the speech in the first language, one or more of the prosodic cues being of a specific type of prosodic cue; responsive to recognizing the prosodic cues, produce a back-channel cue corresponding to the specific type of prosodic cue; provide, by the speech translation system, the produced back-channel cue to the speaker, the back-channel cue comprising an audible confirmation that initiation of the translation of the speech in the first language has occurred; and determine a translation result in the second language.
12. The computer program product of claim 11, wherein the recognizing (gerund) by the speech translation system of the one or more prosodic cues comprising a pause in the speech by the speaker adjusts sensitivity for detection of a break point beginning the pause dependent on a speech setting for the speech by the speaker.
19. The method of claim 14, wherein the one or more prosodic cues are selected from the group consisting of pauses, pitch contours, and intensity changes.
20. The method of claim 17, wherein the prosodic cues are selected from the group consisting of pauses and pitch contours.
21. A computer program product (transliterated in Japanese as コンピュータプログラムプロダクト) comprising a non-transitory computer readable storage medium having instructions encoded thereon that, when executed by a processor, cause the processor to: recognizing (typo for "recognize") speech by a speaker; identifying the speech by the speaker as being in a first language; initiating a translation of the speech in the first language, by a speech translation system, into a second language; recognizing, by the speech translation system, one or more prosodic cues in the speech in the first language, one or more of the prosodic cues being of a specific type of prosodic cue; responsive to recognizing the prosodic cues, producing a back-channel cue corresponding to the specific type of prosodic cues; providing, responsive to recognizing a back-channel cue to the speaker, the back-channel cue comprising an audible confirmation that the speech translation system is ready to receive additional speech for translation; determining a translation result in the second language. (US9070363)
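(Note: to make the pause handling in claims 3 and 4 and the cue-to-back-channel mapping in claim 1 easier to picture, here is a minimal Python sketch. It is only an illustration under my own assumptions, not the patented implementation: the setting names, frame counts, energy threshold, and cue strings are all hypothetical, since the claims specify none of them.)

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical: minimum number of consecutive quiet frames that counts as a
# pause, per speech setting (claim 4: sensitivity depends on a speech setting).
PAUSE_FRAMES_BY_SETTING = {"conversational": 3, "lecture": 6}

# Hypothetical: back-channel cue chosen per type of prosodic cue (claim 1).
BACK_CHANNEL_BY_CUE = {"pause": "uh-huh", "intensity_change": "mm-hm"}


@dataclass
class ProsodicCue:
    kind: str         # type of prosodic cue, e.g. "pause"
    start_frame: int  # break point at which the cue begins


def detect_pauses(energies: List[float], setting: str,
                  silence_threshold: float = 0.1) -> List[ProsodicCue]:
    """Find runs of quiet frames long enough to count as a pause."""
    min_frames = PAUSE_FRAMES_BY_SETTING[setting]
    cues: List[ProsodicCue] = []
    run_start: Optional[int] = None
    for i, energy in enumerate(energies):
        if energy < silence_threshold:
            if run_start is None:
                run_start = i  # candidate break point
        else:
            if run_start is not None and i - run_start >= min_frames:
                cues.append(ProsodicCue("pause", run_start))
            run_start = None
    # A quiet run may extend to the end of the input.
    if run_start is not None and len(energies) - run_start >= min_frames:
        cues.append(ProsodicCue("pause", run_start))
    return cues


def back_channel_for(cue: ProsodicCue) -> Optional[str]:
    """Return the back-channel cue for this cue type, or None if unmapped."""
    return BACK_CHANNEL_BY_CUE.get(cue.kind)
```

(With the same input, the "lecture" setting requires a longer silence than "conversational" before a pause is reported, which is one way to read "adjusts sensitivity ... dependent on a speech setting".)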