Introducing a Powerful Web Scraping Solution
Web scraping is essential for data collection, but it often runs into blocked requests, CAPTCHAs, and rate limiting. The n8n community has developed a node that addresses these common obstacles: the n8n-nodes-scrappey integration.
This new node works as a smart fallback for your HTTP requests in n8n, automatically handling situations where your regular requests fail due to various protection mechanisms.
What Makes This Node Special?
The Scrappey integration stands out because it:
- Functions as a direct extension of n8n's HTTP Node
- Requires no manual configuration beyond adding your API key
- Automatically retries failed requests through an advanced anti-bot network
- Maintains your existing workflow structure
Protection Systems It Bypasses
The node effectively handles multiple types of web protection:
- Cloudflare challenges and protection
- CAPTCHAs (hCaptcha, reCAPTCHA) with automatic solving
- Various scraping detection systems
- Datadome anti-bot protection
- Rate limiting and IP blocks
- JavaScript-heavy websites through browser simulation
Getting Started in Four Simple Steps
1. Install the Node
In n8n, navigate to Settings → Community Nodes → Install and enter the package name @nskha/n8n-nodes-scrappey
2. Obtain Your API Key
- Sign up at Scrappey.com
- Access 750 Direct requests + 150 Browser requests for free
- Create "Scrappey API" credentials in n8n with your API key
3. Connect as a Fallback
- Set up your normal HTTP Request node
- Connect the HTTP Request node's error output (red connector) to the Scrappey node; if no error output is shown, enable it under On Error in that node's settings
- Select "HTTP Auto-Retry" operation
- That's it: failed requests are now retried through Scrappey
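Outside of n8n, the same fallback wiring can be sketched in a few lines of Python. This is only an illustration of the pattern, not the node's implementation: `fetch_with_fallback`, `blocked_fetch`, and `fallback_fetch` are hypothetical names, and the two fetchers are stand-ins that perform no network calls.

```python
from typing import Callable, Tuple


def fetch_with_fallback(
    url: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> Tuple[str, str]:
    """Try `primary` first; if it raises, pass the same URL to `fallback`.

    Returns a (source, body) pair so the caller can see which path answered.
    """
    try:
        return "primary", primary(url)
    except Exception:
        # Comparable to routing the HTTP Request node's error output
        # into a second node that retries the request.
        return "fallback", fallback(url)


# Hypothetical stand-in fetchers, for illustration only.
def blocked_fetch(url: str) -> str:
    raise RuntimeError("403 Forbidden")


def fallback_fetch(url: str) -> str:
    return f"<html>content of {url}</html>"


source, body = fetch_with_fallback("https://example.com", blocked_fetch, fallback_fetch)
print(source)  # which path served the response
```

The regular request stays the first choice, so the fallback is used (and billed) only for requests that actually fail.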
4. Explore Advanced Options (Optional)
For more control, the "Request Builder" mode offers:
- Custom headers and cookies
- Proxy targeting across 150+ countries
- Mouse movement simulation
- Session management
- CSS selector waiting
Three Flexible Operation Modes
The node provides three distinct ways to work:
- HTTP Auto-Retry: Perfect as a fallback for existing HTTP nodes
- Browser Auto-Retry: Full browser simulation with CAPTCHA solving capabilities
- Request Builder: Complete custom request control for advanced scenarios
Pricing Structure
- Free tier: 750 Direct + 150 Browser requests
- Scaling: €100 = 600,000 request credits (including proxies & CAPTCHA solving)
The credit system works as follows:
- HTTP auto-retry costs 0.2 credits per request
- GUI browser request costs 1 credit per request
- Residential proxies & CAPTCHA solving are included at no additional cost
- Premium or mobile proxies may cost up to 3 credits
For many small businesses, the free tier provides an excellent starting point, while larger operations can scale efficiently with the paid options.
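To make the credit figures concrete, here is a rough budget estimate in Python. It is a sketch based only on the numbers quoted above, and it assumes that the "up to 3 credits" for premium or mobile proxies is a per-request cost. The function name `requests_for_budget` and the dictionary keys are made up for this example.

```python
from fractions import Fraction

# Stated rate: EUR 100 buys 600,000 credits, i.e. 6,000 credits per euro.
CREDITS_PER_EURO = Fraction(600_000, 100)

# Credits per request, using exact fractions to avoid float rounding.
CREDIT_COST = {
    "http_auto_retry": Fraction(1, 5),  # 0.2 credits
    "browser": Fraction(1),             # 1 credit
    "premium_proxy_max": Fraction(3),   # assumed per-request upper bound
}


def requests_for_budget(euros: int, request_type: str) -> int:
    """Return how many whole requests of one type a budget covers."""
    credits = Fraction(euros) * CREDITS_PER_EURO
    return int(credits // CREDIT_COST[request_type])


for kind in CREDIT_COST:
    print(kind, requests_for_budget(100, kind))
# http_auto_retry 3000000
# browser 600000
# premium_proxy_max 200000
```

At the stated rate, €100 therefore covers about 3 million HTTP auto-retries, 600,000 browser requests, or at least 200,000 requests through the most expensive proxies.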
Why This Matters for Your Automation
Data collection is often the foundation of automation workflows. When HTTP requests fail due to protection mechanisms, entire workflows can break down. The Scrappey integration provides a safety net that keeps your data flowing even when faced with sophisticated blocking techniques.
By connecting this node to your existing HTTP requests, you gain:
- More reliable data collection
- Reduced workflow failures
- Access to previously inaccessible websites
- Time saved troubleshooting blocked requests
Real-World Applications
This integration can transform how you handle various automation scenarios:
- Price monitoring: Track competitor prices even on protected e-commerce sites
- Content aggregation: Collect data from news sites with anti-bot measures
- Lead generation: Extract contact information from websites with rate limiting
- Market research: Gather data across multiple protected sources
- Automated testing: Test your own websites' protection mechanisms
Feedback Welcome
As this node is still in beta, the developers are actively seeking feedback and use cases. If you encounter issues or have feature requests, you can open an issue on GitHub.
Getting More Information
For additional details and documentation:
- GitHub repository: n8n-nodes-scrappey
- NPM package: @nskha/n8n-nodes-scrappey
- Full examples are available in the README
Turn your blocked HTTP requests into reliable data sources with this powerful integration that works seamlessly with your existing n8n workflows!