Scrape OpenGraph Metadata with Node.js and OpenGraph.io: A Practical Guide

Learn how to extract OpenGraph tags and metadata from websites using Node.js and the OpenGraph.io API. Includes examples for building social preview tools and link unfurling features.

OpenGraph tags are essential for rich link previews on social media platforms like Facebook, LinkedIn, and Twitter. If you’re building tools like link preview generators, SEO analyzers, or social content schedulers, scraping OpenGraph metadata is a must.

In this post, we’ll show you how to scrape OpenGraph metadata from websites using Node.js, both with and without OpenGraph.io, a third-party API built for this job.

📘 What Are OpenGraph Tags?

OpenGraph tags are HTML meta tags that define how a web page should be represented when shared.

Here’s an example:

<meta property="og:title" content="OpenGraph Tags Guide" />
<meta property="og:description" content="Learn how to extract OpenGraph metadata with Node.js." />
<meta property="og:image" content="https://example.com/preview.jpg" />

🛠 Use Cases for Scraping OpenGraph Metadata

Generating link previews in chat or email apps
Building content aggregators
Previewing links in social media dashboards
SEO optimization tools
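
As a rough sketch of the first use case, the snippet below turns a map of scraped tags into a link preview object. The `buildLinkPreview` helper and its fallback rules are assumptions made for illustration, not part of any library: it falls back to the page hostname when `og:title` or `og:site_name` is missing, and resolves a relative `og:image` path against the page URL.

```javascript
// Hypothetical helper: converts a flat map of og:* tags into a preview object.
function buildLinkPreview(ogTags, pageUrl) {
  const host = new URL(pageUrl).hostname;
  const rawImage = ogTags['og:image'];

  return {
    // Fall back to the hostname when the page has no og:title
    title: ogTags['og:title'] || host,
    description: ogTags['og:description'] || '',
    // Resolve relative image paths (e.g. "/preview.jpg") against the page URL
    image: rawImage ? new URL(rawImage, pageUrl).href : null,
    url: ogTags['og:url'] || pageUrl,
    siteName: ogTags['og:site_name'] || host,
  };
}

const preview = buildLinkPreview(
  { 'og:title': 'OpenGraph Tags Guide', 'og:image': '/preview.jpg' },
  'https://example.com/blog/post'
);
console.log(preview);
```

Any of the scraping methods in this post can feed a helper like this, as long as they produce an object keyed by tag name.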

🧪 Method 1: Using OpenGraph.io (API-Based)

✅ Pros:

Fast and reliable
Handles redirects, JavaScript, and timeouts
Great fallback when a site blocks scraping

🔗 Step 1: Sign Up for an API Key

Visit https://www.opengraph.io and get an API key.

📦 Step 2: Install Axios

npm install axios

🧑‍💻 Step 3: Fetch Metadata

const axios = require('axios');

const OG_IO_API_KEY = 'your-api-key-here';
const targetUrl = 'https://example.com';

const encodedUrl = encodeURIComponent(targetUrl);

const url = `https://opengraph.io/api/1.1/site/${encodedUrl}?app_id=${OG_IO_API_KEY}`;

axios.get(url)
  .then(response => {
    console.log(response.data.hybridGraph); // og:title, og:image, etc.
  })
  .catch(err => {
    console.error('Error fetching OpenGraph data:', err);
  });
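
In a real project you would probably not hardcode the key or build the request URL inline. The sketch below wraps the same endpoint shown above in a small function that validates its inputs first. The `buildOpenGraphIoUrl` name and the `OPENGRAPH_IO_APP_ID` environment variable are assumptions for this example, not something OpenGraph.io prescribes.

```javascript
// Hypothetical helper: builds the OpenGraph.io request URL used above.
function buildOpenGraphIoUrl(targetUrl, appId) {
  if (!appId) {
    throw new Error('Missing OpenGraph.io app_id');
  }

  // new URL() throws a TypeError if targetUrl is not an absolute URL
  const parsed = new URL(targetUrl);
  if (!['http:', 'https:'].includes(parsed.protocol)) {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }

  // The target URL goes into the path, so it must be fully percent-encoded
  const encodedUrl = encodeURIComponent(targetUrl);
  return `https://opengraph.io/api/1.1/site/${encodedUrl}?app_id=${encodeURIComponent(appId)}`;
}

// Assumed environment variable name; falls back to a placeholder for the demo
const appId = process.env.OPENGRAPH_IO_APP_ID || 'your-api-key-here';
console.log(buildOpenGraphIoUrl('https://example.com/a?b=1', appId));
```

The returned string can be passed straight to `axios.get`, exactly as in the example above.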

🧪 Method 2: Scraping HTML with Node.js and Cheerio

If you don’t want to depend on a third-party API, you can fetch the page yourself and parse its HTML:

📦 Install Packages

npm install axios cheerio

🧑‍💻 Code Example

const axios = require('axios');
const cheerio = require('cheerio');

async function scrapeOGTags(url) {
  try {
    const { data } = await axios.get(url);
    const $ = cheerio.load(data);

    const ogTags = {};
    $('meta').each((_, el) => {
      const property = $(el).attr('property');
      const content = $(el).attr('content');

      if (property && property.startsWith('og:')) {
        ogTags[property] = content;
      }
    });

    console.log(ogTags);
  } catch (error) {
    console.error('Scraping failed:', error);
  }
}

scrapeOGTags('https://example.com');
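
One limitation of the loop above is that a later tag overwrites an earlier one with the same property, while the Open Graph protocol allows a property such as `og:image` to appear more than once. A minimal sketch of one way to keep every value is shown below. It assumes you first collect `[property, content]` pairs inside the `$('meta').each(...)` callback; the `collectOgTags` name is hypothetical.

```javascript
// Hypothetical helper: groups repeated og:* properties into arrays
// instead of letting the last value win.
function collectOgTags(pairs) {
  const ogTags = {};

  for (const [property, content] of pairs) {
    // Skip non-OpenGraph tags and tags without a content attribute
    if (!property || !property.startsWith('og:') || content == null) continue;

    if (!(property in ogTags)) {
      ogTags[property] = content; // first occurrence: keep a plain string
    } else if (Array.isArray(ogTags[property])) {
      ogTags[property].push(content); // third or later occurrence
    } else {
      ogTags[property] = [ogTags[property], content]; // second occurrence
    }
  }

  return ogTags;
}

const tags = collectOgTags([
  ['og:title', 'Gallery'],
  ['og:image', 'https://example.com/a.jpg'],
  ['og:image', 'https://example.com/b.jpg'],
  ['description', 'not an OpenGraph tag'],
  [undefined, 'meta tag without a property attribute'],
]);
console.log(tags);
```

Callers then need to handle both shapes: a string for single values and an array for repeated ones.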

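⚠️ Note: This method can fail on sites that render their content with client-side JavaScript (e.g., React or Angular apps), because the meta tags may not be present in the initial HTML. In that case, consider using Puppeteer (see the bonus section below).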
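
One way to act on this note is to try the cheap static scrape first and only launch a browser when the result looks incomplete. The sketch below is a heuristic under stated assumptions: both scraper functions are passed in by the caller and are assumed to return the tag object (the examples in this post log it instead), and treating `og:title` and `og:image` as the required tags is an arbitrary choice for this example.

```javascript
// Hypothetical heuristic: a static result is "incomplete" when any of the
// tags we rely on is missing or empty.
function needsRenderedFallback(ogTags, required = ['og:title', 'og:image']) {
  return required.some((key) => !ogTags[key]);
}

// staticScraper and renderedScraper are caller-supplied async functions that
// take a URL and resolve to an object of og:* tags (an assumption of this sketch).
async function scrapeWithFallback(url, staticScraper, renderedScraper) {
  const staticTags = await staticScraper(url);
  if (!needsRenderedFallback(staticTags)) {
    return staticTags; // static HTML was enough, no browser needed
  }

  const renderedTags = await renderedScraper(url);
  // Tags found after rendering take precedence over the static ones
  return { ...staticTags, ...renderedTags };
}

console.log(needsRenderedFallback({ 'og:title': 'Only a title' }));
```

Launching a headless browser is far slower than a plain HTTP request, so keeping it as a fallback is usually the cheaper design.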

🧪 Method 3: Using the opengraph-io Node.js Client

Install the opengraph-io client library:

npm install opengraph-io

Next, configure the client with your OpenGraph.io app ID and create a function that fetches the metadata for a given webpage:

const og = require('opengraph-io')({
    appId: 'abcxyz', // App ID from OpenGraph.io
    cacheOk: true, // Use a cached result when one is available (faster)
    useProxy: false, // Proxies help avoid being blocked and can bypass CAPTCHAs
    maxCacheAge: 432000000, // The maximum cache age to accept
    acceptLang: 'en-US,en;q=0.9', // Language to present to the site
    fullRender: false // When true, executes JavaScript during rendering to handle JS-dependent sites
});

function scrapeMetaData(webpage) {
    // getSiteInfo already returns a promise, so no extra Promise wrapper is needed
    return og.getSiteInfo(webpage);
}

scrapeMetaData('https://example.com')
    .then((metadata) => console.log(metadata))
    .catch((err) => console.error('Error fetching OpenGraph data:', err));

🚀 Bonus: Use Puppeteer for JavaScript-Rendered Pages

npm install puppeteer

const puppeteer = require('puppeteer');

async function scrapeWithPuppeteer(url) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2' });

    // Runs inside the page: collect every og:* meta tag from the rendered DOM
    const ogData = await page.evaluate(() => {
      const metaTags = [...document.querySelectorAll('meta')];
      const result = {};
      metaTags.forEach(tag => {
        const property = tag.getAttribute('property');
        const content = tag.getAttribute('content');
        if (property?.startsWith('og:')) {
          result[property] = content;
        }
      });
      return result;
    });

    console.log(ogData);
  } finally {
    // Always close the browser, even if navigation or evaluation fails
    await browser.close();
  }
}

scrapeWithPuppeteer('https://example.com')
  .catch(err => console.error('Scraping failed:', err));

🧠 Conclusion

Whether you’re building a personal project or a production-level tool, scraping OpenGraph metadata is a foundational task. Here’s a quick summary:

Method	Tools Used	When to Use
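OpenGraph.io	`axios`, API key	Quick, reliable, API-based scraping
opengraph-io client	`opengraph-io`, API key	API-based scraping with cache, proxy, and full-render options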
Cheerio	`axios`, `cheerio`	Simple HTML scraping
Puppeteer	`puppeteer`	JavaScript-heavy sites