R vs. Python?

Both R and Python are popular languages used to perform data analysis tasks. From what I understand, Python is a great general-purpose language, and R's functionality is developed specifically with statisticians in mind. I've heard people argue both sides, but I wonder which is better for daily use?

21 Replies

Hasan Diwan
2
0
Hasan Diwan Entrepreneur
contract Data Scientist to several startups
As with most such questions, it depends. Python was designed by a computer scientist; R by statisticians. The personalities of the designers of each shine through in their use.
Felipe Ruiz
0
0
Felipe Ruiz Entrepreneur
IT Consultant
I'm learning -- taking notes
Sridhar Yerramreddy
1
0
Sridhar Yerramreddy Entrepreneur • Advisor
Founder & CEO at Oculus Health
It depends on what you mean by "daily use". Here are a couple of scenarios:

1. If you are building a general web platform whose use cases center on user engagement rather than data and statistical dashboards, then Python is the more useful choice: it has full-stack web frameworks and a more productive ecosystem for web development than R.

2. If your daily work and product require a lot of data analysis and predictive modeling on large data sets, my bias is that R is the better fit and makes it easier to reach your goals.
Ana Maria Echeverri
2
0
Visual Analytics, Predictive Analytics, Enterprise Software
I would say it depends on what you are trying to do. I use both R and python+scikit-learn. If I am just doing statistical modeling or data mining I prefer to use R. If however I need the analysis to be part of a web app I prefer to use Python. But the bottom line is I can probably achieve the same results from the analysis perspective using either one. Ana
Benjamin Olding
4
0
Benjamin Olding Advisor
Co-founder, Board Member at Jana
I did a PhD in statistics. Everyone used R. I didn't know R (I was not a stats undergrad), and it seemed magical: everyone was using it to solve everything. So, I invested time learning it.

I was pretty disappointed. It really seemed like the result of a small community only knowing a single scripting language. You can do pretty much anything with pretty much any language. Why would you want to though? This isn't a case of best tool - it's just the only script tool for that community (or was at the time - I think it's changing, mercifully).

If you already know R and can accomplish a task with R, and you don't know Python, I can't see a reason for you not to just use R to solve your problem.

If you already know python, then check out pandas and numpy/scipy. When I was in grad school, these tools didn't exist, and as a result, I would have told you then that it made more sense to use the packages already in R than code the specialized routines you needed in another language. Even so, R is just awful at manipulating data; I'd usually manipulate the data into the form I wanted outside R, then use read.table to read it in and pass it through the least amount of R code I needed to get the analysis done. I was hardly alone: in fact, many of my fellow grad students just wrote everything in C++ for their dissertation, using R just as a way to easily bang out graphs when needed.

Now that these python-based tools and libraries exist, however, I see no reason for a python programmer to not turn to them first, regardless of what you may hear about R.

If you do not know either R or python, please just learn python with pandas; this is the future. There is nothing inherent to the R language that makes it superior - it just has a lot of packages already written for it. However, that advantage decreases every day as more people contribute to pandas and numpy. I love stats - but the ideas behind statistical analysis aren't "owned" by a programming language. Python didn't really exist when S was created (the precursor to R). S+ and then R had real advantages over other script-based languages for a long time. It's just no longer the case.
Python can realistically be used for 20 other things, unlike R, and the reality of analysis is usually that more than 50% of the work is getting the data into a usable form. R just fails at this. As a result, I used a lot of awk and sed; but python will get things done too. I only turned to awk and sed because R was so terrible at manipulating real-world raw data. R does a fine job at analysis once you have things in table form, but it doesn't do a better job at it than python if the routine exists in both languages (and, unless you're doing something pretty obscure at this point, it likely does).
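
As a rough illustration of the "getting the data into a usable form" step described above, here is a minimal sketch using only the Python standard library. The raw lines, the two-column layout, and the function names `clean_rows` and `to_tsv` are hypothetical, invented for this example; real-world data will need its own rules. The result is a tab-separated table of the kind read.table (or pandas) can load.

```python
import csv
import io
import re


def clean_rows(raw_lines):
    """Yield (name, amount) tuples from messy, hypothetical raw lines."""
    for line in raw_lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Accept commas, semicolons, or runs of whitespace as separators.
        parts = [p for p in re.split(r"[,;\s]+", line) if p]
        if len(parts) != 2:
            continue  # wrong number of fields: drop the row
        name, amount = parts
        try:
            yield name.lower(), float(amount.replace("$", ""))
        except ValueError:
            continue  # amount is not numeric: drop the row


def to_tsv(rows):
    """Render rows as a tab-separated table with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["name", "amount"])
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical raw input with mixed separators and bad rows.
raw = [
    "# export from a hypothetical billing system",
    "Alice, $10.50",
    "",
    "BOB;3",
    "carol   n/a",
    "dave 7.25 extra",
]

table = to_tsv(clean_rows(raw))
print(table)
```

Only the two well-formed rows survive; the comment, the blank line, the non-numeric amount, and the three-field row are dropped.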

I really don't see a trade-off on this one. Unless you already know R for some reason, I believe the answer to your question is python, full stop.
Hasan Diwan
1
0
Hasan Diwan Entrepreneur
contract Data Scientist to several startups
Dr Olding, The gamlss package for R has no equivalent in python. And the plotting tools are primitive. There's no python equivalent for RGM[1]. -- H
Tom Maiaroto
0
0
Tom Maiaroto Entrepreneur • Advisor
Full Stack Consultant
Actually, I find Go perfect for working with and pushing around big data on the web (I think it has specific benefits with regard to networking and parallel processing, but there are many additional benefits as well)...But if you are choosing only between R and Python for big data it honestly depends. Python is likely going to have a much larger community and ecosystem for packages that you may be able to leverage.

That really means a lot for a business. R is great, but if it's too obscure and everything must be done from scratch or you have a hard time hiring programmers then is it really worth it? To be frank, it's more for math or academics and less for building a business.

That's another reason why I reach for Go as well - it's gaining a lot of traction and there are a lot of packages, but most important of all...It's fast to build things with. It is wonderful for building an application for business and fast.
Dan Oblinger
2
0
Dan Oblinger Entrepreneur
Founder at AnalyticsFire
I second Benjamin's opinion. Scripting in a general-purpose language that has libraries like pandas in it is nearly always a better experience than working in a special-purpose language that was extended after the fact to be a general-purpose language.

Just one example to illustrate the point. In R, certain operations on a data frame will result in other, lower-dimensional objects, and sometimes not. I think the rules originated when the operators were specialized statistical steps. Since then R has been extended to handle all the things general-purpose languages do, but not in the simplest, cleanest way. In Python the entire structure was created clean, then the pandas DataFrame was added, but it does not 'pollute' other operations (like textual manipulation of data in a file).

Hasan noted that Python graphing is primitive compared to R. I do agree on this point.
I generally write up a small Python function that dumps the R statements into a file in /tmp and then invoke R on that file. (Once this is done, that graphing tool is available directly within Python.)
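
A minimal sketch of that workflow, not the exact function described above: it assumes the `Rscript` command-line runner is installed and on the PATH, and the helper names `write_r_script`, `run_r`, and `r_histogram` are hypothetical. The script is written to the system temporary directory (usually /tmp on Unix).

```python
import shutil
import subprocess
import tempfile


def write_r_script(statements, directory=None):
    """Write R statements to a temporary .R file and return its path."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".R", dir=directory, delete=False
    ) as fh:
        fh.write("\n".join(statements) + "\n")
        return fh.name


def run_r(statements):
    """Dump the statements to a file and run it with Rscript.

    Assumes Rscript is on the PATH; raises if it is missing or if R fails.
    """
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript not found on PATH")
    script = write_r_script(statements)
    subprocess.run(["Rscript", script], check=True)
    return script


def r_histogram(values, png_path):
    """Build R statements that draw a histogram of values into a PNG file."""
    vector = ", ".join(repr(float(v)) for v in values)
    return [
        f'png("{png_path}")',
        f"hist(c({vector}))",
        "dev.off()",
    ]
```

With R installed, `run_r(r_histogram([1, 2, 2.5], "/tmp/hist.png"))` would produce the plot; without it, `write_r_script` can still be used to inspect the generated script.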

Hasan also noted other statistical functions that R has that python does not. Certainly true, but if you listed the algs in scipy and scikit-learn I am positive there would be many not found in R.

My only disclaimer: I am not a hard-core stats guy. I am doing ML, and lots of data preprocessing.
So I cannot assess the completeness of the Python environment from the perspective of a stats guy.
--dan
Shobhit Verma
2
0
Shobhit Verma Entrepreneur • Advisor
building an adaptive recommendation engine
I got degrees in Statistics as well as Computer Science. I love and use R for exploration and once I have played with the data and figured out what model would generalize best, I use python to create a production version algorithm that scales.
If you do not want to learn python you may be able to go very far using Revolution Analytics support. However, I just prefer rewriting in python as it allows me to be more in control of the various optimizations at scale.
Bojan Tunguz
1
0
Bojan Tunguz Entrepreneur
Chief Data Scientist at Tunguz Consulting LLC
Another consideration might be performance. In my experience Python is much faster than R, which can be a serious issue for large data sets.