Scrape OpenGraph Metadata with Node.js and OpenGraph.io: A Practical Guide

Learn how to extract OpenGraph tags and metadata from websites using Node.js and the OpenGraph.io API. Includes examples for building social preview tools and link unfurling features.

OpenGraph tags are essential for rich link previews on social media platforms like Facebook, LinkedIn, and Twitter. If you’re building tools like link preview generators, SEO analyzers, or social content schedulers, scraping OpenGraph metadata is a must.

In this post, we’ll show you how to scrape OpenGraph metadata from websites using Node.js, both with OpenGraph.io (a third-party API) and without it, by parsing the page yourself.


📘 What Are OpenGraph Tags?

OpenGraph tags are HTML meta tags that define how a web page should be represented when shared.

Here’s an example:

<meta property="og:title" content="OpenGraph Tags Guide" />
<meta property="og:description" content="Learn how to extract OpenGraph metadata with Node.js." />
<meta property="og:image" content="https://example.com/preview.jpg" />
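To make the target of scraping concrete, here is a minimal sketch that turns the three tags above into a plain key/value object. It uses regular expressions only so that it runs with no dependencies; the helper names readAttributes and extractOgTags are hypothetical, and for real pages a proper HTML parser such as Cheerio (Method 2) is the safer choice.

```javascript
// Illustration only: regex parsing is fragile on real-world HTML.
const html = `
<meta property="og:title" content="OpenGraph Tags Guide" />
<meta property="og:description" content="Learn how to extract OpenGraph metadata with Node.js." />
<meta property="og:image" content="https://example.com/preview.jpg" />
`;

// Collect the double-quoted attributes of a single tag into an object.
function readAttributes(tag) {
  const attrs = {};
  for (const [, name, value] of tag.matchAll(/([\w:-]+)\s*=\s*"([^"]*)"/g)) {
    attrs[name.toLowerCase()] = value;
  }
  return attrs;
}

// Keep only the meta tags whose property starts with "og:".
function extractOgTags(markup) {
  const tags = {};
  for (const [tag] of markup.matchAll(/<meta\b[^>]*>/gi)) {
    const { property, content } = readAttributes(tag);
    if (property && property.startsWith('og:')) {
      tags[property] = content;
    }
  }
  return tags;
}

const sampleTags = extractOgTags(html);
console.log(sampleTags);
```

The result is an object keyed by property name (og:title, og:description, og:image), which is the same shape the scraping methods later in this post produce.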

🛠 Use Cases for Scraping OpenGraph Metadata

  • Generating link previews in chat or email apps
  • Building content aggregators
  • Previewing links in social media dashboards
  • SEO optimization tools

🧪 Method 1: Using OpenGraph.io (API-Based)

✅ Pros:

  • Fast and reliable
  • Handles redirects, JavaScript, and timeouts
  • Great fallback when a site blocks scraping

🔗 Step 1: Sign Up for an API Key

Create an account at https://www.opengraph.io and copy your API key. It is sent with every request as the app_id parameter.

📦 Step 2: Install Axios

npm install axios

🧑‍💻 Step 3: Fetch Metadata

const axios = require('axios');

// Replace with your own key, or load it from an environment variable.
const OG_IO_API_KEY = 'your-api-key-here';
const targetUrl = 'https://example.com';

// The target URL is part of the request path, so it must be URL-encoded.
const encodedUrl = encodeURIComponent(targetUrl);

const url = `https://opengraph.io/api/1.1/site/${encodedUrl}?app_id=${OG_IO_API_KEY}`;

axios.get(url)
  .then(response => {
    console.log(response.data.hybridGraph); // og:title, og:image, etc.
  })
  .catch(err => {
    console.error('Error fetching OpenGraph data:', err.message);
  });
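If you call the API from several places, it can help to build the request URL in one helper. The sketch below reuses the endpoint from the example above; the function name buildOpenGraphIoUrl and the OG_IO_API_KEY environment variable are assumptions made for illustration, not part of the OpenGraph.io API.

```javascript
// Endpoint copied from the example above; the helper itself is hypothetical.
const OG_IO_ENDPOINT = 'https://opengraph.io/api/1.1/site/';

function buildOpenGraphIoUrl(targetUrl, appId) {
  if (!appId) {
    throw new Error('Missing OpenGraph.io app_id');
  }
  // new URL() throws a TypeError when targetUrl is not an absolute URL,
  // so malformed input is rejected before any request is sent.
  const normalized = new URL(targetUrl).toString();
  const query = new URLSearchParams({ app_id: appId });
  return `${OG_IO_ENDPOINT}${encodeURIComponent(normalized)}?${query}`;
}

const requestUrl = buildOpenGraphIoUrl(
  'https://example.com/post?id=1',
  process.env.OG_IO_API_KEY || 'your-api-key-here' // assumed variable name
);
console.log(requestUrl);
```

Reading the key from the environment keeps it out of source control, and validating the target URL up front avoids spending an API call on input that could never succeed.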

🧪 Method 2: Scraping HTML with Node.js and Cheerio

If you don’t want to depend on a third-party API, you can download the page yourself and parse its HTML with Cheerio.

📦 Install Packages

npm install axios cheerio

🧑‍💻 Code Example

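const axios = require('axios');
const cheerio = require('cheerio');

// Fetch a page and collect every og:* meta tag into a plain object.
async function scrapeOGTags(url) {
  try {
    // Fail fast instead of hanging on slow hosts.
    const { data } = await axios.get(url, { timeout: 10000 });
    const $ = cheerio.load(data);

    const ogTags = {};
    $('meta').each((_, el) => {
      // Some sites put OpenGraph keys in "name" instead of "property".
      const property = $(el).attr('property') || $(el).attr('name');
      const content = $(el).attr('content');

      if (property && property.startsWith('og:')) {
        ogTags[property] = content;
      }
    });

    console.log(ogTags);
    return ogTags;
  } catch (error) {
    console.error('Scraping failed:', error.message);
    return {};
  }
}

scrapeOGTags('https://example.com');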

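⚠️ Note: This method only sees the HTML the server returns. It can miss tags on sites that inject their metadata with client-side JavaScript (e.g., React or Angular apps). In that case, consider using Puppeteer, covered in the bonus section below.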
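Another thing to watch for when scraping HTML directly: some pages publish relative paths in URL-valued tags such as og:image. A minimal sketch, assuming you have the tags object and the page URL it came from, resolves them with Node's built-in URL class. The helper name absolutizeOgUrls and the list of keys it touches are choices made for this example.

```javascript
// Hypothetical helper: resolve URL-valued OpenGraph tags against the page URL.
function absolutizeOgUrls(ogTags, pageUrl) {
  const urlKeys = ['og:image', 'og:url', 'og:video', 'og:audio'];
  const result = { ...ogTags }; // copy so the input object is not mutated

  for (const key of urlKeys) {
    if (!result[key]) continue;
    try {
      // Absolute values pass through unchanged; relative ones use pageUrl as base.
      result[key] = new URL(result[key], pageUrl).href;
    } catch {
      // Drop values that cannot be parsed as a URL at all.
      delete result[key];
    }
  }
  return result;
}

const resolved = absolutizeOgUrls(
  { 'og:title': 'Example', 'og:image': '/images/preview.jpg' },
  'https://example.com/blog/post'
);
console.log(resolved['og:image']);
```

Doing this once, right after scraping, means downstream preview code can treat every image or video value as a fetchable absolute URL.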


🧪 Method 3: Using the opengraph-io Client Library

The opengraph-io npm package wraps the OpenGraph.io API, so you don’t have to build request URLs yourself. Install the library:

npm install opengraph-io

Next, create a function that takes a web page URL and returns its metadata, using your OpenGraph.io app ID:

const og = require('opengraph-io')({
    appId: 'abcxyz', // App ID from OpenGraph.io
    cacheOk: true, // Use a cached result when one is available (faster)
    useProxy: false, // Proxies help avoid being blocked and can bypass CAPTCHAs
    maxCacheAge: 432000000, // The maximum cache age to accept
    acceptLang: 'en-US,en;q=0.9', // Language to present to the site
    fullRender: false // Set to true to execute JavaScript for JS-dependent sites
});

// getSiteInfo already returns a promise, so it can be returned directly
// instead of being wrapped in a new Promise.
function scrapeMetaData(webpage) {
    return og.getSiteInfo(webpage);
}
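Once the metadata comes back, a link preview usually needs only a handful of fields. The sketch below reduces a result to a small preview object. It assumes the result has the same hybridGraph object seen in Method 1, and that it contains title, description, image, and url string fields; those field names are assumptions to check against a real response, and toLinkPreview is a hypothetical helper.

```javascript
// Hypothetical helper: reduce an OpenGraph.io result to link-preview fields.
// `hybridGraph` comes from the Method 1 example; the field names inside it
// (title, description, image, url) are assumptions, not confirmed here.
function toLinkPreview(siteInfo, fallbackUrl) {
  const graph = (siteInfo && siteInfo.hybridGraph) || {};
  // Accept only non-empty strings so the preview never shows "undefined".
  const text = (value) =>
    typeof value === 'string' && value.trim() ? value.trim() : null;

  return {
    title: text(graph.title) || fallbackUrl,
    description: text(graph.description),
    image: text(graph.image),
    url: text(graph.url) || fallbackUrl,
  };
}

// A stubbed result stands in for a real API response so this runs offline.
const stubResult = {
  hybridGraph: { title: ' Example Domain ', image: 'https://example.com/preview.jpg' },
};
const preview = toLinkPreview(stubResult, 'https://example.com');
console.log(preview);
```

Falling back to the page URL for a missing title keeps the preview usable even when a site publishes almost no metadata.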

🚀 Bonus: Use Puppeteer for JavaScript-Rendered Pages

npm install puppeteer

const puppeteer = require('puppeteer');

async function scrapeWithPuppeteer(url) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so client-side tags are in the DOM.
    await page.goto(url, { waitUntil: 'networkidle2' });

    // Runs inside the page: collect every og:* meta tag.
    const ogData = await page.evaluate(() => {
      const metaTags = [...document.querySelectorAll('meta')];
      const result = {};
      metaTags.forEach(tag => {
        const property = tag.getAttribute('property');
        const content = tag.getAttribute('content');
        if (property?.startsWith('og:')) {
          result[property] = content;
        }
      });
      return result;
    });

    console.log(ogData);
    return ogData;
  } finally {
    // Always close the browser, even if navigation or evaluation throws.
    await browser.close();
  }
}

scrapeWithPuppeteer('https://example.com').catch(console.error);

🧠 Conclusion

Whether you’re building a personal project or a production-level tool, scraping OpenGraph metadata is a foundational task. Here’s a quick summary:

Method        | Tools Used      | When to Use
OpenGraph.io  | axios, API key  | Quick, reliable, API-based scraping
Cheerio       | axios, cheerio  | Simple HTML scraping
Puppeteer     | puppeteer       | JavaScript-heavy sites
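The summary suggests a natural ordering: try the cheapest method first and escalate only when it finds nothing. Here is a minimal sketch of that idea, assuming each scraper is an async function that returns a tags object (rather than only logging it). The name scrapeWithFallback and the { name, run } shape are hypothetical, and stub scrapers replace the real Cheerio and Puppeteer functions so the example runs offline.

```javascript
// Try each scraper in order and return the first non-empty result.
async function scrapeWithFallback(url, scrapers) {
  const errors = [];
  for (const { name, run } of scrapers) {
    try {
      const tags = await run(url);
      if (tags && Object.keys(tags).length > 0) {
        return { source: name, tags };
      }
      errors.push(`${name}: no og: tags found`);
    } catch (error) {
      // Remember why this scraper failed, then move on to the next one.
      errors.push(`${name}: ${error.message}`);
    }
  }
  throw new Error(`All scrapers failed for ${url} (${errors.join('; ')})`);
}

// Stubs: the first finds nothing (like Cheerio on a JS-rendered page),
// the second succeeds (like Puppeteer after rendering).
const demoScrapers = [
  { name: 'cheerio', run: async () => ({}) },
  { name: 'puppeteer', run: async () => ({ 'og:title': 'Rendered title' }) },
];

scrapeWithFallback('https://example.com', demoScrapers)
  .then((result) => console.log(result))
  .catch((error) => console.error(error.message));
```

In a real project you would pass the actual scraping functions in place of the stubs, ordered from cheapest to most expensive.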
