Web Scraper and Airtable Integration with Next.js

The Problem

Imagine you want to keep track of the templates available on the Webflow website. You’re interested in monitoring when new templates are added and when existing templates are removed from the homepage. Manually checking the website and updating a spreadsheet would be a tedious and time-consuming task. That’s where our Next.js web scraper and Airtable integration come into play.

The Solution

We’ll build a Next.js application that scrapes the Webflow templates page, extracts relevant information, and stores it in an Airtable base. The application will consist of two main components:

  1. Webflow Scraper API Route:
  • The webflow-scraper.js file defines an API route that fetches the HTML content of the Webflow templates page using the axios library.
  • It then uses the cheerio library to parse the HTML and extract the desired data, such as the template name, URL, and publish date.
  • The scraped data is returned as a JSON response.
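
The scraper route described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the original implementation: the page URL, the CSS selectors (`.template-card`, `.template-name`, `.publish-date`), and the `toTemplate` helper are hypothetical and would need to be matched against the real markup. It uses CommonJS so the snippet also runs under plain Node; in a Next.js project, `export default handler` is the more usual form.

```javascript
// pages/api/webflow-scraper.js (sketch; URL and selectors are assumptions)
const TEMPLATES_URL = 'https://webflow.com/templates'; // assumed page URL

// Pure helper (hypothetical): normalise one scraped card into the JSON shape
// the route returns.
function toTemplate({ name, href, publishDate }) {
  return {
    name: name.trim(),
    url: new URL(href, TEMPLATES_URL).href, // resolve relative links
    publishDate: publishDate ? publishDate.trim() : null,
  };
}

async function scrapeTemplates() {
  // Required here rather than at the top so toTemplate can be exercised
  // without the packages installed.
  const axios = require('axios');
  const cheerio = require('cheerio');

  // Fetch the raw HTML and load it into cheerio for querying.
  const { data: html } = await axios.get(TEMPLATES_URL);
  const $ = cheerio.load(html);

  // Hypothetical selectors: one element per template card.
  return $('.template-card')
    .map((_, el) => {
      const card = $(el);
      return toTemplate({
        name: card.find('.template-name').text(),
        href: card.find('a').attr('href') || '',
        publishDate: card.find('.publish-date').text(),
      });
    })
    .get(); // convert the cheerio collection into a plain array
}

async function handler(req, res) {
  try {
    res.status(200).json(await scrapeTemplates());
  } catch (err) {
    res.status(500).json({ error: 'Failed to scrape templates' });
  }
}

module.exports = handler;
```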

  2. Compare and Update API Route:
  • The compare-and-update.js file defines another API route that compares the scraped data with the existing records in the Airtable base.
  • It fetches the scraped data from the Webflow Scraper API route and retrieves all records from the specified Airtable table.
  • It compares the scraped data with the Airtable records to identify new templates and templates that have been removed from the homepage.
  • For new templates, it creates new records in the Airtable table.
  • For templates that exist in Airtable but are no longer on the homepage, it updates the “Last Time on Home Page” field with the current date.
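
The comparison step above can be sketched as a pure function plus a small write helper. This is a sketch under assumptions rather than the original code: templates are matched by URL, Airtable records are simplified to a hypothetical `{ id, url }` shape, and the `Name` and `URL` field names are placeholders. Only the “Last Time on Home Page” field name comes from the description above. The `table` argument is expected to behave like an airtable.js table, whose `create` and `update` methods accept at most 10 records per request.

```javascript
// Sketch of the comparison logic in compare-and-update.js.
// Assumptions: templates are matched by URL; records look like { id, url }.
const LAST_SEEN_FIELD = 'Last Time on Home Page';

function diffTemplates(scraped, airtableRecords, today) {
  const scrapedUrls = new Set(scraped.map((t) => t.url));
  const knownUrls = new Set(airtableRecords.map((r) => r.url));

  // On the page but not yet in Airtable: these become new records.
  const toCreate = scraped.filter((t) => !knownUrls.has(t.url));

  // In Airtable but no longer on the page: stamp them with today's date.
  const toUpdate = airtableRecords
    .filter((r) => !scrapedUrls.has(r.url))
    .map((r) => ({ id: r.id, fields: { [LAST_SEEN_FIELD]: today } }));

  return { toCreate, toUpdate };
}

// Split a list into batches, since Airtable limits writes to 10 records each.
function chunk(items, size = 10) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Write the changes through an airtable.js-style table object.
// 'Name' and 'URL' are hypothetical field names.
async function applyChanges(table, { toCreate, toUpdate }) {
  for (const batch of chunk(toCreate)) {
    await table.create(batch.map((t) => ({ fields: { Name: t.name, URL: t.url } })));
  }
  for (const batch of chunk(toUpdate)) {
    await table.update(batch);
  }
}
```

Here `today` would typically be an ISO date string such as `new Date().toISOString().slice(0, 10)`. Keeping `diffTemplates` free of network calls makes the matching rule easy to check in isolation before any records are written.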

Airtable Integration

To integrate with Airtable, we use the airtable package and configure it with our Airtable API key and base ID. We define the necessary field IDs and table name in the lib/airtable.js file, making it easy to reference them throughout the application.
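
A lib/airtable.js along these lines might look like the sketch below. The table name, the field IDs, the environment variable names, and the `getBase` and `toFields` helpers are all placeholders chosen for illustration, not values from the original project. The base is created lazily because the airtable client throws if it is constructed without an API key.

```javascript
// lib/airtable.js (sketch; table name and field IDs are placeholders)
const TABLE_NAME = 'Templates'; // hypothetical table name

// Hypothetical field IDs; real ones start with "fld" and come from your base.
const FIELDS = {
  name: 'fldNamePlaceholder',
  url: 'fldUrlPlaceholder',
  publishDate: 'fldPublishDatePlaceholder',
  lastTimeOnHomePage: 'fldLastSeenPlaceholder',
};

let cachedBase = null;

// Create the Airtable base on first use and reuse it afterwards.
// AIRTABLE_API_KEY and AIRTABLE_BASE_ID are assumed environment variable names.
function getBase() {
  if (!cachedBase) {
    const Airtable = require('airtable');
    cachedBase = new Airtable({ apiKey: process.env.AIRTABLE_API_KEY })
      .base(process.env.AIRTABLE_BASE_ID);
  }
  return cachedBase;
}

// Map a scraped template onto Airtable fields, keyed by field ID.
function toFields(template) {
  return {
    [FIELDS.name]: template.name,
    [FIELDS.url]: template.url,
    [FIELDS.publishDate]: template.publishDate,
  };
}

module.exports = { TABLE_NAME, FIELDS, getBase, toFields };
```

With this in place, a route could call `getBase()(TABLE_NAME).select().all()` to read every record, and pass `toFields(template)` as the `fields` of a new record.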

Scheduled Execution

To automate the scraping and comparison, we use Vercel Cron Jobs. A vercel.json file at the project root maps the compare-and-update API route to a cron schedule (e.g., every 12 hours), so Vercel calls the route at those intervals and our Airtable base stays up to date with the latest changes on the Webflow templates page.
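
As an illustration, a vercel.json for the 12-hour example might look like this. The path assumes the route lives at pages/api/compare-and-update.js, which is our assumption about the project layout; the schedule is a standard cron expression that Vercel evaluates in UTC.

```json
{
  "crons": [
    {
      "path": "/api/compare-and-update",
      "schedule": "0 */12 * * *"
    }
  ]
}
```

With this schedule the route would be requested at 00:00 and 12:00 UTC each day.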
