Tracking visits to our Node-powered websites is a good opportunity to learn how MongoDB text search works in practice. The hard part is analysing the collected data.
Suppose that we're collecting data in a specific MongoDB collection by saving the URL of the page and the current time of the hit.
First, we can create a text index on our collection.
db.visits.createIndex({ url: "text"})
We are interested in specific keywords that may occur within a page URL. We need a function to perform a text search on our collection.
We also need to calculate the percentage of matching documents with respect to the total number of documents in the collection.
'use strict';
const mongoose = require('mongoose');
mongoose.connect('mongodb://localhost:27017/data', {useNewUrlParser: true});
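// One document per hit: the page URL and the time of the visit.
const visitSchema = new mongoose.Schema({
  url: String,
  date: Date
});
const visits = mongoose.model('visits', visitSchema);
// Share of x in y as a whole percentage, rounded down.
const percentage = (x, y) => {
  return Math.floor((parseInt(x, 10) / parseInt(y, 10)) * 100);
};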
const getTopicsStats = async keyword => {
  try {
    // Count every stored hit, then only the hits whose URL matches the keyword.
    const total = await visits.countDocuments();
    const search = await visits.countDocuments({ $text: { $search: keyword } });
    return {
      topic: keyword,
      found: search,
      percentage: percentage(search, total)
    };
  } catch (err) {
    console.log(err);
  }
};
Keywords/topics are usually part of our URLs. In this case we're counting how many hits we can find with a single search over the entire collection.
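The percentage arithmetic is easy to check in isolation. The following is a minimal sketch, not the code used above: `percentageOf` is a hypothetical variant of the `percentage` helper that skips `parseInt` and, as an assumption of this sketch, returns 0 for an empty collection instead of `NaN`. The counts are illustrative only.

```javascript
'use strict';

// Share of matching documents, rounded down to a whole percent.
// Returning 0 for an empty collection is an assumption made for this sketch.
const percentageOf = (found, total) =>
  total > 0 ? Math.floor((found / total) * 100) : 0;

// Illustrative counts only, not real measurements.
const sample = { topic: 'css', found: 25, total: 200 };

console.log(percentageOf(sample.found, sample.total)); // 12 (12.5 rounded down)
console.log(percentageOf(0, 0));                       // 0 instead of NaN
```

Because of `Math.floor`, a keyword that matches fewer than 1% of the documents is reported as 0, so the `found` count remains the more reliable figure for rare topics.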
Our dedicated function uses async/await because Mongoose's model methods, such as countDocuments(), return objects that can be awaited like Promises.
Since an asynchronous function always returns a Promise, we can collect those Promises and use them to perform batch searches.
const getAllTopicsStats = () => {
  const keywords = ['css', 'javascript', 'jquery', 'php', 'wordpress', 'nodejs'];
  // Start one search per keyword and wait for all of them to finish.
  const stats = keywords.map(keyword => getTopicsStats(keyword));
  return Promise.all(stats);
};
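To see what Promise.all does for a batch like this without a database, here is a minimal sketch. `fakeTopicStats` and `runBatch` are hypothetical stand-ins introduced for illustration: the stub resolves after a delay instead of querying MongoDB, and its `found` value is just a placeholder.

```javascript
'use strict';

// Hypothetical stand-in for getTopicsStats(): resolves after a delay
// instead of querying MongoDB. `found` is a placeholder value.
const fakeTopicStats = (keyword, delayMs) =>
  new Promise(resolve => {
    setTimeout(() => resolve({ topic: keyword, found: keyword.length }), delayMs);
  });

const runBatch = () => {
  // The first search is deliberately the slowest one.
  const pending = [
    fakeTopicStats('css', 30),
    fakeTopicStats('javascript', 10),
    fakeTopicStats('php', 20)
  ];
  // Promise.all resolves once every search has finished
  // and keeps the results in input order.
  return Promise.all(pending);
};

runBatch().then(results => {
  console.log(results.map(r => r.topic)); // [ 'css', 'javascript', 'php' ]
});
```

The results come back in the order of the input array, not in the order in which the searches complete. One caveat when applying this to getTopicsStats: because that function catches its own errors and returns nothing in that case, a failed search shows up as an undefined entry in the array rather than rejecting the whole batch.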
Finally, you can get all your data with a single routine.
const sortStats = stats => {
  // Most frequent topic first. Note that sort() reorders the array in place.
  const sorted = stats.sort((a, b) => b.found - a.found);
  return sorted;
};
const displayTopicsStats = async () => {
  try {
    const stats = await getAllTopicsStats();
    console.log(sortStats(stats));
  } catch (err) {
    console.log(err);
  }
};
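If you want to extend these features a little further, you can also collect the User-Agent string from the corresponding HTTP header and use text search on the newly added field to get information about the browsers used by your visitors. Keep in mind that a MongoDB collection can have only one text index, so the existing index has to be replaced by one that covers both fields, and a $text query then searches all of the indexed fields together.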
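As a starting point for that extension, the sketch below shows one way to build the document for a single hit. `buildVisit` and the `userAgent` field name are assumptions made for illustration; the schema shown earlier would need a matching `userAgent: String` field. The request object is assumed to resemble Node's http.IncomingMessage, which exposes header names in lower case.

```javascript
'use strict';

// Hypothetical helper: builds the document to store for a single hit.
// `req` is assumed to resemble Node's http.IncomingMessage, which exposes
// header names in lower case.
const buildVisit = (req, now = new Date()) => ({
  url: req.url,
  date: now,
  // Some clients send no User-Agent header; store an empty string then.
  userAgent: req.headers['user-agent'] || ''
});

// Usage with a hand-written request-like object (illustrative values only).
const visit = buildVisit({
  url: '/2019/01/css-grid-basics',
  headers: { 'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0' }
});
console.log(visit.userAgent);
```

Keeping this step as a plain function makes it easy to test without a running server or database; the returned object can then be passed to the model when saving the hit.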