Hello,

I'm trying to take the speech output from Azure TTS and inject it into a phone call handled with Ozeki.

  1. Line registration -> OK
  2. Receiving the external call -> OK
  3. TTS generation -> OK
  4. Connecting the Azure stream to Ozeki -> I don't know how to do this


My constraints:
  • I must not use files on disk.
  • Latency must be as low as possible.
  • As long as the caller has not hung up, TTS playback must continue as the audio is generated.
  • .NET Framework 4.5
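
To make the first three constraints concrete, here is a minimal sketch of an in-memory pipe that a producer can write audio chunks into while a consumer reads them as they arrive, with no file on disk. `AudioPipeStream` and `CompleteWriting` are hypothetical names invented for this illustration; they are not part of the Azure Speech SDK or the Ozeki SDK. It uses only base class library types available on .NET Framework 4.5.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

// Hypothetical helper (not part of the Azure or Ozeki SDKs): an in-memory,
// thread-safe pipe. A producer appends chunks with Write; a consumer blocks
// in Read until data arrives or CompleteWriting is called.
public sealed class AudioPipeStream : Stream
{
    private readonly BlockingCollection<byte[]> _chunks = new BlockingCollection<byte[]>();
    private byte[] _current; // chunk currently being drained by Read
    private int _offset;     // read position inside _current

    // Producer side: copy the chunk, because the caller may reuse its buffer.
    public override void Write(byte[] buffer, int offset, int count)
    {
        if (count <= 0) return;
        var copy = new byte[count];
        Buffer.BlockCopy(buffer, offset, copy, 0, count);
        _chunks.Add(copy);
    }

    // Producer side: signal that no more audio will arrive.
    public void CompleteWriting()
    {
        _chunks.CompleteAdding();
    }

    // Consumer side: returns 0 only once writing is complete and all data is drained.
    public override int Read(byte[] buffer, int offset, int count)
    {
        if (count == 0) return 0;
        if (_current == null)
        {
            // Blocks until a chunk is available; returns false once
            // CompleteAdding was called and the queue is empty.
            if (!_chunks.TryTake(out _current, Timeout.Infinite))
            {
                _current = null;
                return 0;
            }
            _offset = 0;
        }
        int n = Math.Min(count, _current.Length - _offset);
        Buffer.BlockCopy(_current, _offset, buffer, offset, n);
        _offset += n;
        if (_offset == _current.Length) _current = null;
        return n;
    }

    public override bool CanRead { get { return true; } }
    public override bool CanWrite { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override void Flush() { }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}
```

One possible wiring, stated as an assumption rather than a tested setup: a class derived from the Azure SDK's `PushAudioOutputStreamCallback` could forward each buffer from its `Write(byte[] dataBuffer)` override to `pipe.Write(dataBuffer, 0, dataBuffer.Length)` and call `pipe.CompleteWriting()` from `Close()`. How the reading end is handed to Ozeki depends on which of its playback classes accept a `Stream`, which this sketch does not cover.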


Does anyone have a lead?

TTS side:

public static async Task<byte[]> SynthesisToPushAudioOutputStreamAsync()
        {
            var conf = SpeechConfig.FromSubscription(az_key, az_reg);
            // Synthesis (not recognition) settings: language and voice name.
            conf.SpeechSynthesisLanguage = az_lang;
            conf.SpeechSynthesisVoiceName = az_voice;
 
            // Prepare ssml from text input
            //var ssml = $@"<speak version='1.0' xml:lang='fr-FR' xmlns='http://www.w3.org/2001/10/synthesis' xmlns:emo='http://www.w3.org/2009/10/emotionml' xmlns:mstts='http://www.w3.org/2001/mstts'><voice name='{az_voice}'><s /><mstts:express-as style='cheerful'>{text}</mstts:express-as><s /></voice></speak>";
 
            // Creates an instance of a custom class inherited from PushAudioOutputStreamCallback.
            var callback = new PushAudioOutputStreamSampleCallback();
 
            // Creates an audio out stream from the callback.
            using (var stream = AudioOutputStream.CreatePushStream(callback))
            {
                // Creates a speech synthesizer using audio stream output.
                using (var streamConfig = AudioConfig.FromStreamOutput(stream))
                using (var synthesizer = new SpeechSynthesizer(conf, streamConfig))
                {
                    while (true)
                    {
                        // Receives a text from console input and synthesize it to push audio output stream.
                        Console.WriteLine("Enter some text that you want to synthesize, or enter empty text to exit.");
                        Console.Write("> ");
                        string text = Console.ReadLine();
                        if (string.IsNullOrEmpty(text))
                        {
                            break;
                        }
 
                        using (var result = await synthesizer.SpeakTextAsync(text))
                        {
                            if (result.Reason == ResultReason.SynthesizingAudioCompleted)
                            {
                                Console.WriteLine($"Speech synthesized for text [{text}], and the audio was written to output stream. first byte latency: {callback.GetLatency()}");
                            }
                            else if (result.Reason == ResultReason.Canceled)
                            {
                                var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
                                Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");
 
                                if (cancellation.Reason == CancellationReason.Error)
                                {
                                    Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                                    Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
                                    Console.WriteLine($"CANCELED: Did you update the subscription info?");
                                }
                            }
                        }
                    }
                }
 
                Console.WriteLine($"Total: {callback.GetAudioData().Length} bytes received.");
                return callback.GetAudioData();
            }
        }
Call-handling side, not so great:
 
 
        static MediaConnector connector;
        static PhoneCallAudioSender mediaSender;
        static PhoneCallAudioReceiver mediaReceiver;
        static Speaker speaker;
        static Microphone microphone;
        static RawStreamPlayback rawStreamPlayback;
 
 
 
        public static async void IncomingCall(object sender, VoIPEventArgs<IPhoneCall> e)
        {
            Log.info("Incoming call from: " + e.Item.DialInfo.ToString());
            var call = e.Item;
            call.CallStateChanged += StateChanged;
            call.Answer();
 
 
            // Retrieve the audio stream
            /*var callback = new PushAudioOutputStreamSampleCallback();
            await TTS.SynthesisToPushAudioOutputStreamAsync();
            
            var data = callback.GetAudioData();
            playback = new AudioStreamPlayback(data);*/
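            speaker = Speaker.GetDefaultDevice();
            // The microphone field was never assigned, so the Connect and Start calls below would fail on null.
            microphone = Microphone.GetDefaultDevice();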
 
            // Create a new RawStreamPlayback object (assign the static field instead of shadowing it with a local)
            rawStreamPlayback = new RawStreamPlayback();
 
            // Attach the audio output device (e.g. speakers) to the RawStreamPlayback object
            rawStreamPlayback.AttachToAudioOutputDevice();
 
            // Open a file stream to read the audio data
            FileStream fileStream = new FileStream("audio.pcm", FileMode.Open, FileAccess.Read);
 
            // Start playing the audio data from the file stream
            rawStreamPlayback.Start(fileStream);
 
 
            mediaSender = new PhoneCallAudioSender();
            mediaReceiver = new PhoneCallAudioReceiver();
 
            connector = new MediaConnector();
 
            connector.Connect(microphone, mediaSender);
            connector.Connect(mediaReceiver, speaker);
 
            mediaSender.AttachToCall(call);
            mediaReceiver.AttachToCall(call);
 
            microphone.Start();
            speaker.Start();
 
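            // No blocking loop here: the static fields keep the media objects alive,
            // and blocking this event handler forever may prevent later call events from being handled.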
        }
 
 
 
        public static void StateChanged(object sender, CallStateChangedArgs e)
        {
            Log.info("Call state " + e.State.ToString());
        }
Thanks in advance to anyone adventurous enough to take a look!