Node.js: multiple file downloads with Promises

In this tutorial we will see how to perform multiple downloads in Node.js with Promises.

Let's say you have a web page containing a series of HTML links referencing MP4 video files. As a first step we need to install the cheerio module to be able to extract the URLs from these links.

npm install cheerio --save

Then we define a function that downloads the remote web page.

'use strict';

const https = require('https');
const http = require('http');
const fs = require('fs');
const cheerio = require('cheerio');

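const getPage = pageURL => {
    return new Promise((resolve, reject) => {
        https.get(pageURL, res => {
            // Collect the raw chunks and decode them once the response ends
            const data = [];
            res.on('data', chunk => {
                data.push(chunk);
            });
            res.on('end', () => {
                resolve(Buffer.concat(data).toString());
            });
            // Reject if the response stream fails midway
            res.on('error', reject);
        }).on('error', reject); // Reject on connection errors
    });
};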
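
getPage() joins the chunks with Buffer.concat() before calling toString(), rather than decoding each chunk on its own. A minimal sketch of why this order matters for multi-byte characters is given here; decodeChunks is a hypothetical helper introduced only for illustration, and the split of the euro sign across two chunks is an assumed scenario.

```javascript
'use strict';

// Hypothetical helper (not part of the tutorial): joins the raw chunks first
// and decodes them once, so characters split across chunks stay intact.
const decodeChunks = chunks => Buffer.concat(chunks).toString('utf8');

// The euro sign takes three bytes in UTF-8; here it is split over two chunks.
const euro = Buffer.from('€', 'utf8');
const chunks = [euro.subarray(0, 1), euro.subarray(1)];

// Decoding chunk by chunk cannot rebuild the character...
const naive = chunks.map(chunk => chunk.toString('utf8')).join('');

// ...while concatenating before decoding restores it.
const joined = decodeChunks(chunks);

console.log(naive === '€');  // false
console.log(joined === '€'); // true
```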

The page is received as a series of data chunks, which are collected in an array and converted into a single string only when the transfer ends. Next, we define the procedure for extracting the links from the page.

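const getFileURLS = async url => {
    const urls = [];
    try {
        const html = await getPage(url);
        const $ = cheerio.load(html);
        // Select every link that carries the download attribute
        $('a[download]').each(function() {
            const href = $(this).attr('href');
            if (href) {
                // Resolve relative links against the page URL
                urls.push(new URL(href, url).href);
            }
        });
        return urls;
    } catch(err) {
        // On failure, return whatever was collected so far
        return urls;
    }
};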

The urls array will contain the URLs of the MP4 videos. It is populated by looping with cheerio over all a elements that have the download attribute and adding the value of their href attribute to the array.
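
An href value is not necessarily an absolute URL, and the URL constructor used later in the tutorial throws on a relative value when no base is given. As a rough sketch, relative links can be resolved against the page address with the built-in URL class; toAbsoluteURLs and the example.com addresses below are hypothetical and serve only as an illustration.

```javascript
'use strict';

// Hypothetical helper: resolves href values against the URL of the page
// they were found on, skipping values that cannot be parsed.
const toAbsoluteURLs = (hrefs, pageURL) => {
    const urls = [];
    for (const href of hrefs) {
        try {
            urls.push(new URL(href, pageURL).href);
        } catch (err) {
            // Ignore values that do not form a valid URL
        }
    }
    return urls;
};

// Assumed sample data: a root-relative, a relative and an absolute link
const resolved = toAbsoluteURLs(
    ['/videos/lecture01.mp4', 'lecture02.mp4', 'http://cdn.example.com/lecture03.mp4'],
    'https://example.com/course/cs107'
);

console.log(resolved);
```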

Now we need to define the function that will download a single video.

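const download = (url, destPath) => {
    return new Promise((resolve, reject) => {
        // Pick the client module that matches the URL protocol
        const client = url.startsWith('https:') ? https : http;
        client.get(url, res => {
            const fileStream = fs.createWriteStream(destPath);
            res.pipe(fileStream);
            // Resolve only when the file has been completely written
            fileStream.on('finish', () => resolve(true));
            fileStream.on('error', reject);
            res.on('error', reject);
        }).on('error', reject);
    });
};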

The video data arrives as a stream of byte chunks, which the pipe() method of the response object forwards to the write stream created with createWriteStream(). This way the file is written to disk as it is received instead of being buffered entirely in memory first, as would happen with fs.writeFile().
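
As an alternative to calling pipe() directly, the pipeline() function from the stream/promises module (available in Node.js 15 and later) returns a Promise that settles when the whole transfer has finished or failed. The sketch below is a variation on the tutorial's approach, not the approach itself: saveStream and demo are hypothetical names, and an in-memory readable stream stands in for the HTTP response.

```javascript
'use strict';

const fs = require('fs');
const os = require('os');
const path = require('path');
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// Hypothetical helper: streams any readable source to disk and resolves
// only after the destination file has been fully written.
const saveStream = async (source, destPath) => {
    await pipeline(source, fs.createWriteStream(destPath));
    return destPath;
};

// Demo with an in-memory source standing in for an HTTP response
const demo = async () => {
    const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'stream-demo-'));
    const destPath = path.join(dir, 'sample.bin');
    const source = Readable.from([Buffer.from('first chunk, '), Buffer.from('second chunk')]);
    await saveStream(source, destPath);
    const written = fs.readFileSync(destPath, 'utf8');
    // Remove the temporary directory created for the demo
    fs.rmSync(dir, { recursive: true, force: true });
    return written;
};

demo().then(written => console.log(written));
```

Because pipeline() rejects if either stream emits an error, a single await covers both the read side and the write side.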

Now we just have to create an array of Promises containing all the HTTP requests for the videos.

const createDownloadRequests = urls => {
    const requests = [];
    for(const url of urls) {
        // Use the last segment of the URL path as the file name
        const urlObj = new URL(url);
        const parts = urlObj.pathname.split('/');
        const filename = parts[parts.length - 1];
        // The ./video directory must already exist
        requests.push(download(url, `./video/${filename}`));
    }
    return requests;
};
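
The file name is taken from the last segment of the URL path, so any query string is ignored, and a path that ends with a slash yields an empty name. A small sketch of that step in isolation follows; filenameFromURL, its fallback parameter and the example.com addresses are hypothetical additions for illustration.

```javascript
'use strict';

// Hypothetical helper: returns the last segment of the URL path,
// or a fallback name when the path ends with a slash.
const filenameFromURL = (url, fallback = 'download.bin') => {
    const parts = new URL(url).pathname.split('/');
    const last = parts[parts.length - 1];
    return last || fallback;
};

// The query string is not part of pathname, so it does not end up in the name
console.log(filenameFromURL('http://example.com/videos/lecture01.mp4?token=abc'));

// A trailing slash leaves the last segment empty, so the fallback is used
console.log(filenameFromURL('http://example.com/videos/'));
```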

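Finally, we can use the Promise.all() method to perform all downloads. Promise.all() rejects as soon as one download fails; if we accept that some downloads may fail, we can use Promise.allSettled() instead, which waits for every request and reports the outcome of each.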

(async () => {
    const baseURL = 'https://see.stanford.edu/course/cs107';
    try {
        const urls = await getFileURLS(baseURL);
        const requests = createDownloadRequests(urls);
        // Wait until every download has completed
        await Promise.all(requests);
    } catch(err) {
        console.error(err);
    }
})();
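
The difference between Promise.all() and Promise.allSettled() can be sketched with simulated requests. Here fakeDownload is a hypothetical stand-in for download() that fails on demand, and the file names are made up for the example.

```javascript
'use strict';

// Hypothetical stand-in for download(): settles after a short delay
const fakeDownload = (name, shouldFail) => new Promise((resolve, reject) => {
    setTimeout(() => {
        if (shouldFail) {
            reject(new Error(`${name} failed`));
        } else {
            resolve(name);
        }
    }, 10);
});

const makeRequests = () => [
    fakeDownload('a.mp4', false),
    fakeDownload('b.mp4', true),
    fakeDownload('c.mp4', false)
];

const compare = async () => {
    // Promise.all() rejects with the first failure
    let allError = null;
    try {
        await Promise.all(makeRequests());
    } catch (err) {
        allError = err.message;
    }

    // Promise.allSettled() waits for every request and reports each outcome
    const settled = await Promise.allSettled(makeRequests());
    const saved = settled.filter(r => r.status === 'fulfilled').map(r => r.value);
    const failed = settled.filter(r => r.status === 'rejected').map(r => r.reason.message);

    return { allError, saved, failed };
};

compare().then(result => console.log(result));
```

With Promise.allSettled() the successful downloads are kept, and the failed ones can be logged or retried separately.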

In conclusion, combining Promises with streams suits an I/O-bound operation such as file download: the requests run concurrently, and each file is written to disk as it arrives rather than being held in memory.
