- Desired Behaviour
- Actual Behaviour
- What I've Tried
- Steps To Reproduce
- Research
Here are two solutions.
Solution 01
It utilises Bluebird.mapSeries from BM's answer, but instead of just mapping over the responses, the requests and responses are handled within the map function. It also resolves promises on the writeable stream finish event, rather than the readable stream end event. Bluebird is helpful in that it pauses iteration within the map function until a response has been received and handled, and then moves on to the next iteration.
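As a minimal sketch of that sequential behaviour (the array contents and the delay are just illustrative stand-ins for the synthesize request and file write), each mapper only starts once the previous one has resolved:
const Bluebird = require("bluebird");

// illustrative only: mapSeries waits for each mapper's promise before starting the next
Bluebird.mapSeries(["chunk one", "chunk two", "chunk three"], async function(text_chunk, index) {
    console.log(`start ${index}`);
    // stand-in for a synthesize request + file write
    await new Promise(resolve => setTimeout(resolve, 1000));
    console.log(`finish ${index}`);
}).then(() => console.log("all chunks processed, in order"));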
Given that the Bluebird map function produces clean audio files, rather than zipping the files you could use a solution like the one in Terry Lennox's answer to combine multiple audio files into one audio file. My first attempt at that solution, using Bluebird and fluent-ffmpeg, produced a single file, but it was of slightly lower quality - no doubt this could be tweaked in the ffmpeg settings, but I didn't have time to do that.
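If the quality drop comes from re-encoding during the merge, one thing that might help (this is only a sketch, assuming the individual mp3 files share the same codec settings; the list and output file names are placeholders) is to use ffmpeg's concat demuxer with stream copy, so the audio isn't re-encoded at all:
const fs = require("fs");
const ffmpeg = require("fluent-ffmpeg");

// sketch: concatenate mp3 files without re-encoding them
function concatWithoutReencoding(files, outputFile) {
    // the concat demuxer reads a text file listing the inputs
    const listFile = "concat_list.txt";
    fs.writeFileSync(listFile, files.map(f => `file '${f}'`).join("\n"));
    ffmpeg()
        .input(listFile)
        .inputOptions(["-f concat", "-safe 0"])
        .outputOptions(["-c copy"]) // copy the audio stream instead of re-encoding it
        .on("error", err => console.log("concat error: ", err))
        .on("end", () => console.log("concat complete"))
        .save(outputFile);
}
In any case, here is the full Solution 01 code: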
// route handler
app.route("/api/:api_version/tts")
.get(api_tts_get);
// route handler middleware
const api_tts_get = async (req, res) => {
var query_parameters = req.query;
var file_name = query_parameters.file_name;
var text_chunk_array = query_parameters.text_string_array; // array of text chunks to synthesize, eg: https://pastebin.com/raw/JkK8ehwV
var absolute_path = path.join(__dirname, "/src/temp_audio/", file_name);
var relative_path = path.join("./src/temp_audio/", file_name); // path relative to server root
// set up archiver
var archive = archiver('zip', {
zlib: { level: 9 } // sets the compression level
});
var zip_write_stream = fs.createWriteStream(`${relative_path}.zip`);
archive.pipe(zip_write_stream);
await Bluebird.mapSeries(text_chunk_array, async function(text_chunk, index) {
// check if last value of array
const isLastIndex = index === text_chunk_array.length - 1;
return new Promise((resolve, reject) => {
var textToSpeech = new TextToSpeechV1({
iam_apikey: iam_apikey,
url: tts_service_url
});
var synthesizeParams = {
text: text_chunk,
accept: 'audio/mp3',
voice: 'en-US_AllisonV3Voice'
};
textToSpeech.synthesize(synthesizeParams, (err, audio) => {
if (err) {
console.log("synthesize - an error occurred: ", err);
return reject(err);
}
// write individual files to disk
var file_name = `${relative_path}_${index}.mp3`;
var write_stream = fs.createWriteStream(`${file_name}`);
audio.pipe(write_stream);
// on finish event of individual file write
write_stream.on('finish', function() {
// add file to archive
archive.file(file_name, { name: `audio_${index}.mp3` });
// if not the last value of the array
if (isLastIndex === false) {
resolve();
}
// if the last value of the array
else if (isLastIndex === true) {
resolve();
// when zip file has finished writing,
// send it back to client, and delete temp files from server
zip_write_stream.on('close', function() {
// download the zip file (using absolute_path)
res.download(`${absolute_path}.zip`, (err) => {
if (err) {
console.log(err);
}
// delete each audio file (using relative_path)
for (let i = 0; i < text_chunk_array.length; i++) {
fs.unlink(`${relative_path}_${i}.mp3`, (err) => {
if (err) {
console.log(err);
}
console.log(`AUDIO FILE ${i} REMOVED!`);
});
}
// delete the zip file
fs.unlink(`${relative_path}.zip`, (err) => {
if (err) {
console.log(err);
}
console.log(`ZIP FILE REMOVED!`);
});
});
});
// from archiver readme examples
archive.on('warning', function(err) {
if (err.code === 'ENOENT') {
// log warning
} else {
// throw error
throw err;
}
});
// from archiver readme examples
archive.on('error', function(err) {
throw err;
});
// from archiver readme examples
archive.finalize();
}
});
});
});
});
}
Solution 02
I was keen to find a solution that didn't use a library to "pause" within the map() iteration, so I:
- swapped the map() function for a for of loop
- used await before the api call, rather than wrapping it in a promise, and
- rather than using return new Promise() to contain the response handling, I used await new Promise() (gleaned from this answer)
This last change, magically, paused the loop until the archive.file() and audio.pipe(writestream) operations were completed - I'd like to better understand how that works (see the sketch below).
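As far as I understand it, await suspends the async route handler until the awaited promise settles, and that promise only resolves inside the finish handler - so the loop simply can't move on to the next chunk until the current write has completed. A stripped-down sketch of just that pattern (the file names are placeholders):
const fs = require("fs");

async function writeStreamsInOrder(readable_streams) {
    for (const [index, readable] of readable_streams.entries()) {
        // the loop is suspended here until resolve() is called in the finish handler
        await new Promise((resolve, reject) => {
            const write_stream = fs.createWriteStream(`file_${index}.tmp`);
            readable.pipe(write_stream);
            write_stream.on('finish', resolve);
            write_stream.on('error', reject);
        });
        console.log(`file ${index} written`);
    }
}
With that in mind, here is Solution 02 in full: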
// route handler
app.route("/api/:api_version/tts")
.get(api_tts_get);
// route handler middleware
const api_tts_get = async (req, res) => {
var query_parameters = req.query;
var file_name = query_parameters.file_name;
var text_chunk_array = query_parameters.text_string_array; // array of text chunks to synthesize, eg: https://pastebin.com/raw/JkK8ehwV
var absolute_path = path.join(__dirname, "/src/temp_audio/", file_name);
var relative_path = path.join("./src/temp_audio/", file_name); // path relative to server root
// set up archiver
var archive = archiver('zip', {
zlib: { level: 9 } // sets the compression level
});
var zip_write_stream = fs.createWriteStream(`${relative_path}.zip`);
archive.pipe(zip_write_stream);
for (const [index, text_chunk] of text_chunk_array.entries()) {
// check if last value of array
const isLastIndex = index === text_chunk_array.length - 1;
var textToSpeech = new TextToSpeechV1({
iam_apikey: iam_apikey,
url: tts_service_url
});
var synthesizeParams = {
text: text_chunk,
accept: 'audio/mp3',
voice: 'en-US_AllisonV3Voice'
};
try {
var audio_readable_stream = await textToSpeech.synthesize(synthesizeParams);
await new Promise(function(resolve, reject) {
// write individual files to disk
var file_name = `${relative_path}_${index}.mp3`;
var write_stream = fs.createWriteStream(`${file_name}`);
audio_readable_stream.pipe(write_stream);
// on finish event of individual file write
write_stream.on('finish', function() {
// add file to archive
archive.file(file_name, { name: `audio_${index}.mp3` });
// if not the last value of the array
if (isLastIndex === false) {
resolve();
}
// if the last value of the array
else if (isLastIndex === true) {
resolve();
// when zip file has finished writing,
// send it back to client, and delete temp files from server
zip_write_stream.on('close', function() {
// download the zip file (using absolute_path)
res.download(`${absolute_path}.zip`, (err) => {
if (err) {
console.log(err);
}
// delete each audio file (using relative_path)
for (let i = 0; i < text_chunk_array.length; i++) {
fs.unlink(`${relative_path}_${i}.mp3`, (err) => {
if (err) {
console.log(err);
}
console.log(`AUDIO FILE ${i} REMOVED!`);
});
}
// delete the zip file
fs.unlink(`${relative_path}.zip`, (err) => {
if (err) {
console.log(err);
}
console.log(`ZIP FILE REMOVED!`);
});
});
});
// from archiver readme examples
archive.on('warning', function(err) {
if (err.code === 'ENOENT') {
// log warning
} else {
// throw error
throw err;
}
});
// from archiver readme examples
archive.on('error', function(err) {
throw err;
});
// from archiver readme examples
archive.finalize();
}
});
});
} catch (err) {
console.log("oh dear, there was an error: ");
console.log(err);
}
}
}
Learning Experiences
Other issues that came up during this process are documented below:
Long requests time out when using node (and resend the request)...
// solution
req.connection.setTimeout( 1000 * 60 * 10 ); // ten minutes
See: https://github.com/expressjs/express/issues/2512
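For reference, one place this could live (just a sketch, assuming the same route as above and that it is registered before the route handler) is a small middleware:
// sketch: extend the socket timeout for this route only
app.use("/api/:api_version/tts", (req, res, next) => {
    req.connection.setTimeout(1000 * 60 * 10); // ten minutes
    next();
});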
400 errors caused by node max header size of 8KB (query string is included in header size)...
// solution (although probably not recommended - better to get text_string_array from server, rather than client)
node --max-http-header-size 80000 app.js
See: https://github.com/nodejs/node/issues/24692
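A sketch of the more recommended approach (sending the text in a POST body rather than the query string, so the header size limit isn't hit - the route and field names here are assumptions):
const express = require("express");
const app = express();

// the JSON body limit is configurable, unlike the header size limit
app.use(express.json({ limit: "1mb" }));

app.post("/api/:api_version/tts", (req, res) => {
    // text_string_array now arrives in the request body instead of the query string
    const text_string_array = req.body.text_string_array;
    // ...synthesize and respond as before...
    res.sendStatus(202);
});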
WebRTC would be a good option for the above problem: once your file generation is done, the client can be given the audio to listen to over a peer connection.
https://www.npmjs.com/package/simple-peer
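For what it's worth, a very rough sketch of what the server side might look like with simple-peer (this is an assumption about the wiring, not a tested setup - signalling needs its own channel, node needs the wrtc package, and sendSignalToClient / onSignalFromClient are hypothetical helpers):
const fs = require("fs");
const Peer = require("simple-peer");
const wrtc = require("wrtc"); // simple-peer needs a WebRTC implementation in node

const peer = new Peer({ initiator: true, wrtc: wrtc });

// exchange signalling data with the client over an existing HTTP/WebSocket channel
peer.on("signal", data => sendSignalToClient(data)); // hypothetical helper
onSignalFromClient(data => peer.signal(data));       // hypothetical helper

peer.on("connect", () => {
    // once connected, push the generated audio over the data channel
    peer.send(fs.readFileSync("combined.mp3"));
});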
The core problem to solve here is asynchronicity. You almost had it: the problem with the code you posted is that you are piping all source streams in parallel and unordered into the target stream. This means data chunks will flow randomly from different audio streams - even your end event will outrace the pipes without end, closing the target stream too early, which might explain why it increases after you re-open it.
What you want is to pipe them sequentially - you even posted the solution when you quoted:
"You want to add the second read into an eventlistener for the first read to finish..."
or as code:
a.pipe(c, { end: false });
a.on('end', function() {
  b.pipe(c);
});
This will pipe the source streams in sequential order into the target stream.
Taking your code, this would mean replacing the audio_files.forEach loop with:
await Bluebird.mapSeries(audio_files, async (audio, index) => {
const isLastIndex = index == audio_files_length - 1;
audio.pipe(write_stream, { end: isLastIndex });
return new Promise(resolve => audio.on('end', resolve));
});
Note the usage of bluebird.js mapSeries here.
Further advice regarding your code: use const and let instead of var, and consider using camelCase.
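For example, applied to a couple of the variables above:
// instead of: var file_name = query_parameters.file_name;
const fileName = req.query.file_name;
// instead of: var relative_path = path.join("./src/temp_audio/", file_name);
const relativePath = path.join("./src/temp_audio/", fileName);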
Further reading, limitations of combining native node streams: https://github.com/nodejs/node/issues/93
I'll give my two cents here, since I looked at a similar question recently! From what I have tested and researched, you can combine the two .mp3 / .wav streams into one. This results in a file that has noticeable issues, as you've mentioned, such as truncation, glitches, etc.
The only way I believe you can combine the audio streams correctly is with a module that is designed to concatenate sound files/data.
The best result I have obtained is to synthesize the audio into separate files, then combine like so:
function combineMp3Files(files, outputFile) {
const ffmpeg = require("fluent-ffmpeg");
const combiner = ffmpeg().on("error", err => {
console.error("An error occurred: " + err.message);
})
.on("end", () => {
console.log('Merge complete');
});
// Add in each .mp3 file.
files.forEach(file => {
combiner.input(file)
});
combiner.mergeToFile(outputFile);
}
This uses the node-fluent-ffmpeg library, which requires installing ffmpeg.
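If you'd rather not install ffmpeg system-wide, one option (a sketch, assuming the ffmpeg-static package) is to point fluent-ffmpeg at a bundled binary:
const ffmpeg = require("fluent-ffmpeg");
const ffmpegPath = require("ffmpeg-static"); // bundles an ffmpeg binary and exports its path

ffmpeg.setFfmpegPath(ffmpegPath);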
Other than that I'd suggest you ask IBM support (because as you say the docs don't seem to indicate this) how API callers should combine the synthesized audio, since your use case will be very common.
To create the audio files, I do the following:
// Switching to audio/webm and the V3 voices.. much better output
function synthesizeText(text) {
const synthesizeParams = {
text: text,
accept: 'audio/webm',
voice: 'en-US_LisaV3Voice'
};
return textToSpeech.synthesize(synthesizeParams);
}
async function synthesizeTextChunksSeparateFiles(text_chunks) {
const audioArray = await Promise.all(text_chunks.map(synthesizeText));
console.log(`synthesizeTextChunks: Received ${audioArray.length} result(s), writing to separate files...`);
audioArray.forEach((audio, index) => {
audio.pipe(fs.createWriteStream(`audio-${index}.mp3`));
});
}
And then combine like so:
combineMp3Files(['audio-0.mp3', 'audio-1.mp3', 'audio-2.mp3', 'audio-3.mp3', 'audio-4.mp3'], 'combined.mp3');
I should point out that I'm doing this in two separate steps (waiting a few hundred milliseconds would also work), but it should be easy enough to wait for the individual files to be written, then combine them.
Here's a function that will do this:
async function synthesizeTextChunksThenCombine(text_chunks, outputFile) {
const audioArray = await Promise.all(text_chunks.map(synthesizeText));
console.log(`synthesizeTextChunks: Received ${audioArray.length} result(s), writing to separate files...`);
let writePromises = audioArray.map((audio, index) => {
return new Promise((resolve, reject) => {
audio.pipe(fs.createWriteStream(`audio-${index}.mp3`).on('close', () => {
resolve(`audio-${index}.mp3`);
}));
})
});
let files = await Promise.all(writePromises);
console.log('synthesizeTextChunksThenCombine: Separate files: ', files);
combineMp3Files(files, outputFile);
}
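Presumably it can then be called in the same way as the earlier combine example, with text_chunks being the same array of strings used above:
synthesizeTextChunksThenCombine(text_chunks, 'combined.mp3');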