Tired of writing complex code to scrape data from websites like Twitter, LinkedIn, or Instagram? What if you could automate all of that within a visual, drag-and-drop interface?
In this step-by-step guide, I’ll show you how to connect Apify (a premier web scraping platform) with N8n (a powerful no-code automation tool). By the end, you’ll be able to build automated workflows, or “AI agents,” that can scrape data from the web and send it anywhere—like a Google Sheet, a database, or an analytics tool—all without writing a single line of code.
We’ll use a real-world example: scraping replies from Twitter (X) tweets for analysis.
What You’ll Need:
- An Apify Account (Sign up here)
- An N8n Account (You can use the cloud version or self-host it)
- Two tweet URLs you want to scrape (optional)
Step 1: Understanding the Tools
- What is Apify? Think of Apify as a huge app store for web scrapers. These scrapers, called “Actors,” are pre-built tools that can extract data from almost any website—Twitter, Facebook, Google Maps, Amazon, you name it. Instead of building a scraper from scratch, you can just use a proven one.
- What is N8n? N8n is a workflow automation tool. It lets you connect different apps and services (like Apify, Google Sheets, Slack, etc.) using a simple visual editor. You define triggers (e.g., “every day at 9 AM”) and actions (e.g., “scrape data” then “update a spreadsheet”) by dragging and dropping nodes.
Today, we’re making these two tools talk to each other.
Step 2: Finding the Right Scraper on Apify
- Log in to your Apify account.
- You’ll land on the dashboard. At the top, you’ll see a search bar. This is where you find the “Actor” for your specific task.
- For this tutorial, we want to scrape replies from Twitter/X. Type in a relevant search term, like “Twitter reply scraper” or “Tweet replies.”
- Browse the results. You’ll see Actors for various platforms. For this guide, we’ll use a popular one simply named “Twitter Scraper” or similar. (Pro Tip: Look at the number of runs and ratings to choose a reliable Actor).
- Click on the Actor you’ve chosen to open its details page.
Step 3: Getting the API Endpoint from Apify
This is the most crucial step—getting the unique URL that allows N8n to trigger the Apify scraper.
- On the Actor’s page, look for a tab or button called “Integrations” or “API.” Click on it.
- Apify offers several integration methods. Scroll down until you find the option for “Run Actor synchronously and get dataset items.” This is the one we need.
- You will see a long URL (an API endpoint). Click the copy icon next to it. Save this URL; we’ll need it in just a moment.
- This URL is essentially a direct line from your N8n workflow to this specific Apify Actor.
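For orientation, the copied URL generally follows Apify's standard pattern, something like `https://api.apify.com/v2/acts/<actor-id>/run-sync-get-dataset-items?token=<your-api-token>`, with the Actor ID and your personal API token already filled in for you.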
Step 4: Setting Up the N8n Workflow
Now, let’s switch over to N8n to build our automation.
- In your N8n dashboard, create a New Workflow.
- Every workflow needs a starting point. Add a “Manual Trigger” node from the trigger panel. This allows you to start the workflow with a click of a button for testing.
- Next, we need to call the Apify API. Search for and add an “HTTP Request” node. Connect it to the Manual Trigger node.
Step 5: Configuring the HTTP Request Node
This is where we connect to Apify.
- Click to configure the HTTP Request node.
- Method: Select POST from the dropdown menu. (We'll send the Actor's input in the request body, which is why POST is used here rather than GET.)
- URL: Paste the unique API endpoint URL you copied from Apify in Step 3.
- Authentication: The URL you copied from Apify's API tab normally already includes your API token as a query parameter, so no separate authentication is usually needed in N8n. If your URL doesn't include a token, add your Apify API token here under "Authentication" (or append it to the URL), and check the Actor's documentation for any extra requirements.
- Send Body: Toggle this option ON.
- Body Content Type: Select JSON.
- JSON Body: This is where we tell the Apify Actor what to scrape. The required fields will be different for every Actor. For our Twitter reply scraper, we need to provide:
  - tweets: an array of tweet URLs you want to scrape replies from.
  - maxResults: the maximum number of results you want back.
{ "tweets": [ "https://twitter.com/elonmusk/status/1234567890", "https://twitter.com/n8n_io/status/0987654321" ], "maxResults": 30 }
(Replace the URLs with the tweets you want to analyze)
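Before wiring everything up in N8n, it can help to sanity-check the endpoint and input with a quick standalone script. Here's a minimal sketch using Python's requests library; the placeholder endpoint and the input fields (tweets, maxResults) mirror what we configured above, but your Actor's exact field names may differ, so treat it as an illustration rather than a fixed recipe.

```python
import requests

# The "Run Actor synchronously and get dataset items" URL copied from Apify in Step 3.
# It normally already includes your API token as a query parameter (placeholders shown here).
APIFY_ENDPOINT = (
    "https://api.apify.com/v2/acts/<ACTOR_ID>/run-sync-get-dataset-items?token=<YOUR_TOKEN>"
)

# The same input we put in the N8n HTTP Request node's JSON body.
actor_input = {
    "tweets": [
        "https://twitter.com/elonmusk/status/1234567890",
        "https://twitter.com/n8n_io/status/0987654321",
    ],
    "maxResults": 30,
}

# The synchronous endpoint runs the Actor and returns the scraped dataset items directly.
response = requests.post(APIFY_ENDPOINT, json=actor_input, timeout=300)
response.raise_for_status()

items = response.json()  # a list of scraped reply objects
print(f"Scraped {len(items)} items")
```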
Step 6: Testing the Connection
- Click “Execute Node” to run a test.
- If everything is configured correctly, you’ll see two things happen:
- In N8n: The node will execute successfully, and you’ll see an output panel filled with raw data. This is the scraped tweet reply data from Apify!
- In Apify: Go to your Apify dashboard and look at “Actor runs.” You will see a new run in progress or completed, showing the number of results scraped (e.g., 41 items).
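If you'd rather confirm the run from code than from the dashboard, Apify's API also exposes a list of an Actor's recent runs. Here's a minimal sketch using the same placeholder Actor ID and token as above; the exact response fields can vary, so check Apify's API reference if something looks different.

```python
import requests

ACTOR_ID = "<ACTOR_ID>"
APIFY_TOKEN = "<YOUR_TOKEN>"

# Fetch the most recent runs of the Actor, newest first.
resp = requests.get(
    f"https://api.apify.com/v2/acts/{ACTOR_ID}/runs",
    params={"token": APIFY_TOKEN, "desc": 1, "limit": 5},
    timeout=30,
)
resp.raise_for_status()

for run in resp.json()["data"]["items"]:
    print(run.get("id"), run.get("status"), run.get("startedAt"))
```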
Step 7: Understanding the Results
Back in N8n, you can click on the output data and switch to the “Table” view. This will neatly organize the data Apify returned, which typically includes:
- Reply text
- Username of the replier
- Like count on the reply
- Timestamp
- And much more!
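If you eventually want this data outside of N8n, for example as a CSV you can open in a spreadsheet, a few lines of Python are enough. The field names below (text, author, likeCount, createdAt) are illustrative guesses; swap in the keys you actually see in the Table view for your Actor. Pass the function the items list returned by the synchronous endpoint (or a dataset export from Apify).

```python
import csv

def replies_to_csv(items, path="replies.csv"):
    """Write scraped reply items to a CSV file.

    The field names used here are illustrative; replace them with the keys
    your Actor actually returns (check N8n's Table view).
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["reply_text", "username", "likes", "timestamp"])
        for item in items:
            writer.writerow([
                item.get("text"),
                item.get("author"),
                item.get("likeCount"),
                item.get("createdAt"),
            ])
```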
What’s Next? Building Your Automation
Congratulations! You’ve successfully connected N8n and Apify. But this is just the first step. The real power is what you do next with that data. From your HTTP Request node, you can now:
- Send it to a Google Sheet: Use the “Google Sheets” node to append this data to a spreadsheet automatically.
- Perform Sentiment Analysis: Connect to an AI service like OpenAI to analyze whether the replies are positive, negative, or neutral (see the small sketch after this list).
- Get alerts in Slack: Use the “Slack” node to send a message with a summary of the scraped data.
- Schedule it: Change the “Manual Trigger” to a “Schedule Trigger” to run this scrape daily, weekly, or hourly.
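As a taste of the sentiment-analysis idea above, here is a minimal sketch that classifies a single reply with the OpenAI Python SDK. The model name and prompt are assumptions made for illustration; inside N8n you would more likely use the built-in OpenAI node and feed it the reply text coming out of the HTTP Request node.

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY to be set in your environment

def classify_sentiment(reply_text: str) -> str:
    """Return 'positive', 'negative', or 'neutral' for a single reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": (
                "Classify the sentiment of this tweet reply as positive, "
                "negative, or neutral. Answer with one word only.\n\n" + reply_text
            ),
        }],
    )
    return response.choices[0].message.content.strip().lower()

# Example: tally sentiment across scraped replies (the "text" key is illustrative).
# counts = {}
# for item in items:
#     label = classify_sentiment(item.get("text", ""))
#     counts[label] = counts.get(label, 0) + 1
```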
The combination of Apify’s scraping power and N8n’s automation flexibility opens up a world of possibilities for no-code developers. What will you automate first?