  1.3 What will we install exactly? Why Python, R, SQL and bash?

You can find the O'Reilly course that I mentioned in the video: here.

TRANSCRIPT:

If you are the type of person who likes to see the details before moving forward, here is the list of the tools that we will set up in this tutorial:

  • bash (and mcedit)
  • Python 3 (and Jupyter)
  • PostgreSQL (and pgAdmin 4)
  • R (and RStudio)

But still, the question is:

Why do we care so much about Python, R, and SQL in data coding?

Well, mostly because these are the industry standards. If you go through job descriptions for data positions, in 99% of the cases you will find Python, R, SQL, or a combination of them listed as required languages.

The reason why they became so popular is that Python, R and SQL are very practical and easy-to-use languages. You don’t have to be a developer or an engineer to learn them fairly quickly. If you invest in learning these languages first, it will be much easier for you to adapt to other tools or languages in the future.

One more thought about Python and R: because of their popularity, they are developing really fast. There are many, many cool and free add-ons for both of them, so nowadays there is practically no data science task that you couldn’t do with them.

If this is not enough reasoning for you, you can check out my O’Reilly video course, called Data Science Fundamentals for Marketing and Business Professionals, where I talk a little bit more about data languages - and other exciting things related to data science. See the link in the description.

Did you notice that I didn’t talk about the fourth language, bash? Bash is not a data language, but I would still add it to the list, because it’s just as easy to learn as the other three, and knowing bash will give you a lot of freedom when it comes to moving files, changing your data environment, doing small data formatting tasks and other activities.