DuckDuckGo Search JS
Scraping search results from DuckDuckGo's Lite page with Node.js is a simple way to gather data on a specific topic or group of keywords. In this blog post, we will walk through the process using the built-in fetch API and the JSDOM library.
First, install JSDOM by running the following command in your terminal:
npm install jsdom
We will then use the built-in fetch API (available as a global in Node.js 18 and later) to send a GET request to DuckDuckGo's Lite page, passing the search query as the q URL parameter. The HTML in the response is passed to JSDOM, which lets us navigate the page and extract data using the standard DOM API.
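As a small illustration of the query-parameter step, the sketch below builds the request URL on its own. The helper name buildLiteUrl is hypothetical and only serves this example; the encoding itself is the standard encodeURIComponent function.

```javascript
// Hypothetical helper (not part of the original post): builds the Lite search URL.
function buildLiteUrl(query) {
  // encodeURIComponent escapes spaces, "&", and other reserved characters
  // so the query cannot break the URL's parameter syntax.
  return `https://lite.duckduckgo.com/lite/?q=${encodeURIComponent(query)}`;
}

console.log(buildLiteUrl("node.js & jsdom"));
// https://lite.duckduckgo.com/lite/?q=node.js%20%26%20jsdom
```

The search function below interpolates the query into the URL in the same way.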
import { JSDOM } from "jsdom";

/**
 * Fetches the HTML of the DuckDuckGo Lite search page, parses it, and returns the results.
 * @param {string} query - The query to search for
 * @returns An array of objects with the following properties:
 * title: The title of the result
 * description: The description of the result
 * url: The url of the result
 */
export async function search(query) {
  // Built-in fetch (Node.js 18+); the body is read as text.
  const response = await fetch(
    `https://lite.duckduckgo.com/lite/?q=${encodeURIComponent(query)}`
  );
  const html = await response.text();
  const { document } = new JSDOM(html).window;

  // Skip everything up to and including the last sponsored row, if there is one.
  const sponsored = [
    ...document.querySelectorAll("tr[class='result-sponsored']"),
  ].pop();
  const trs = [...document.querySelectorAll("tr")];
  // The remaining rows are handled in groups of four, one group per result.
  const rawRes = [...chunks(trs.slice(trs.indexOf(sponsored) + 1), 4)];

  const results = [];
  for (const group of rawRes) {
    // Ignore a trailing group that has fewer than four rows.
    if (group.length !== 4) continue;

    const link = group[1].querySelector("a");
    const snippet = group[2].querySelector("td[class='result-snippet']");
    const linkText = group[3].querySelector("span[class='link-text']");
    // Ignore groups that do not have the expected result markup.
    if (!link || !snippet || !linkText) continue;

    results.push({
      title: link.textContent,
      description: snippet.textContent,
      // The link text is used without a scheme, so one is prepended.
      url: "http://" + linkText.textContent,
    });
  }
  return results;
}
This uses the chunking function below:
function* chunks(arr, n) {
  for (let i = 0; i < arr.length; i += n) {
    yield arr.slice(i, i + n);
  }
}
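To see why the search function checks that each group has exactly four entries, here is a minimal sketch of the generator on plain data. The rows array is a made-up stand-in for table rows, and the generator is repeated only so the snippet runs on its own.

```javascript
// Same generator as above, repeated so this snippet is self-contained.
function* chunks(arr, n) {
  for (let i = 0; i < arr.length; i += n) {
    yield arr.slice(i, i + n);
  }
}

// Hypothetical stand-in for ten table rows.
const rows = ["r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "r10"];

// Spreading the generator collects every chunk into an array.
const groups = [...chunks(rows, 4)];
console.log(groups.map((g) => g.length)); // [ 4, 4, 2 ]

// The last chunk is short when the length is not a multiple of 4,
// so only complete groups are kept.
const complete = groups.filter((g) => g.length === 4);
console.log(complete.length); // 2
```

Because the final chunk can be shorter than the requested size, leftover rows at the end of the page are dropped instead of being read as an incomplete result.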