What are the Best Data Scraping Tools?

I am looking to scrape data from a few websites on a regular basis. Keep in mind that the budget is very modest.

What data scraping tools would you recommend (free or cheap), or do you have any contacts that you'd recommend for this purpose?

Thanks in advance for your feedback.

Lisa


16 Replies

John Dyrek Entrepreneur
Medical Economics at Aetna
Kimono, Connotate
Peter K Chen Entrepreneur
Software/Growth Engineer
import.io
Michael Brill Entrepreneur
Technology startup exec focused on AI-driven products
As always, it depends on what you need to scrape, your skillset, and your budget. It's a big world. I've tried and failed with products like import.io and Kimono, and I've written maybe 10-20 scrapers myself. How hard it is depends heavily on the nature of the sites you want to scrape: some take 10 minutes, others are basically impossible.

My quick recommendation is to use Upwork or a similar marketplace to hire a contractor to write your scrapers. They are pretty easy to write for someone with the right skillset, and you can get a basic site scraper for, say, $100.
Onikepe Adegbola, MD PhD
Global Head, Scientific Affairs, Clinical Trials at Quest Diagnostics
You might be able to find someone at SEOClerks: https://www.seoclerk.com.
I'm not affiliated with them, but I know someone who has used the service and was satisfied.
Mark Watkins Entrepreneur • Advisor
Founder, The Hawaii Project
import.io can do this kind of thing. It's a solid tool if your needs are not super complex.
Matthew Watson Entrepreneur
Principal, InfiniteIQ Consulting, LLC
I have used OutWit Hub with good results. It is affordable and fairly straightforward to use.
Amol Umbarkar Entrepreneur
On to Next Venture
I found Diffbot quite helpful. You can run tests on their site and see whether your target site works well with their API.

If you can employ a tech resource, then Scrapy (Python-based) is a good option.

However, none of these tools offers brainless scraping. You will have to tweak things to get the results you are looking for.
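
To give a feel for the kind of tweaking involved, here is a minimal sketch of a hand-written extractor. It uses only Python's standard library (`html.parser`), not Scrapy or Diffbot, and the sample HTML, the `price` class name, and the `extract_prices` helper are all made up for illustration. A real site would need its own selectors.

```python
from html.parser import HTMLParser


class PriceExtractor(HTMLParser):
    """Collect the text inside every <span class="price"> element."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "price" in classes:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        # Keep only non-empty text found inside a price span.
        if self._in_price and data.strip():
            self.prices.append(data.strip())


def extract_prices(html):
    """Return the list of price strings found in the given HTML."""
    parser = PriceExtractor()
    parser.feed(html)
    return parser.prices


# Hypothetical page fragment standing in for a real target site.
SAMPLE_HTML = """
<ul>
  <li><span class="name">Widget</span> <span class="price">$9.99</span></li>
  <li><span class="name">Gadget</span> <span class="price">$24.50</span></li>
</ul>
"""

print(extract_prices(SAMPLE_HTML))  # ['$9.99', '$24.50']
```

The site-specific part is the check in `handle_starttag`. That is the piece that usually has to be adjusted per site, and again whenever the site changes its markup.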
Armando Vieira Entrepreneur
Data Scientist, entrepreneur, speaker
The R package rvest is very easy to use and does the job. In Python there are plenty of comparable libraries.
Stefan Smiljkovic Entrepreneur
Founder at Vanila.io - Web Studio
There are a lot of tools you can use, but you need to have some technical knowledge.

- https://github.com/lapwinglabs/x-ray
- https://github.com/segmentio/nightmare
- https://github.com/n1k0/casperjs
- https://github.com/ariya/phantomjs

You can also reach me at www.vanila.io with more details about what you want to scrape, and I will advise you on it.
Peter Johnston Advisor
Businesses are composed of pixels, bytes & atoms. All 3 change constantly. I make that change +ve.
There are two approaches here.

The first is batch - to do a scrape on a one-off or regular basis. This can be a chore, repeating the same task over and over.

The other is track - to dynamically link so that changes in the target site are reflected in the data you have access to.
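
One low-tech way to approximate tracking on top of regular batch scrapes is to fingerprint each snapshot and compare it with the previous one, so downstream work only runs when something actually changed. The sketch below assumes the page text has already been fetched; `fingerprint` and `detect_change` are hypothetical helper names, and the two snapshots are invented.

```python
import hashlib


def fingerprint(page_text):
    """Return a SHA-256 hex digest of the page text, ignoring surrounding whitespace."""
    return hashlib.sha256(page_text.strip().encode("utf-8")).hexdigest()


def detect_change(previous_fingerprint, page_text):
    """Return (changed, new_fingerprint) for a freshly scraped page."""
    current = fingerprint(page_text)
    return current != previous_fingerprint, current


# Hypothetical snapshots from consecutive scrapes of the same page.
first = "<p>Price: $10</p>"
second = "<p>Price: $12</p>"

_, stored = detect_change(None, first)               # initial scrape
changed_same, stored = detect_change(stored, first)  # nothing changed
changed_new, stored = detect_change(stored, second)  # price changed
print(changed_same, changed_new)  # False True
```

Hashing the whole page flags any change, including ads or timestamps. In practice it usually works better to hash only the extracted fields you care about.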

Increasingly we are moving to this sort of dynamic linking. This again splits into two: sites that would be happy for you to track them and sites that would not.

For friendly dynamic linking, consider an API. Ask them to share data with you, and offer them something in return, often in kind rather than cash: a commission, perhaps, or even just recognition of the source.

One other thing to consider is doing your own modelling from the data. If you have either dynamic data or regular snapshots to create a timeline, you can start to see what the data is doing over time and predict what it might do in future. Eventually this can get good enough that you are almost in charge, being able to set the figure before they do and simply using their real-time data as confirmation.
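
As a rough sketch of the simplest version of this, the code below fits a straight line through a series of snapshots by ordinary least squares and extrapolates one step ahead. The snapshot values and the function names `fit_line` and `predict` are invented for illustration. Real data would rarely be this linear and would likely call for a proper forecasting method.

```python
def fit_line(points):
    """Fit value = slope * t + intercept by ordinary least squares.

    points is a list of (t, value) pairs with at least two distinct t values.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two snapshots")
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    if var_t == 0:
        raise ValueError("snapshots must span more than one point in time")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in points)
    slope = cov / var_t
    intercept = mean_v - slope * mean_t
    return slope, intercept


def predict(points, t):
    """Extrapolate the fitted line to time t."""
    slope, intercept = fit_line(points)
    return slope * t + intercept


# Hypothetical scraped snapshots: (day number, observed figure).
snapshots = [(0, 10.0), (1, 12.0), (2, 14.0)]

print(predict(snapshots, 3))  # 16.0
```

Comparing each prediction with the figure that is later scraped shows how far the model can be trusted before relying on it.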

As well as scraping tools, you may wish to look into dynamic linking tools and data modelling and prediction.

