It is a common misconception that data science is a field only for PhDs. The field of data science is a vast and welcoming playground for anyone with an aptitude for solving problems using data.
Data science is the intersection of three key areas:
Math and statistics
Computer programming
Domain knowledge
Domain knowledge is simply knowledge of the industry or field one works in. A financial analyst would have knowledge of the stock market. A hotel manager would have knowledge of the hospitality industry. A sales manager would know which factors influence buying.
Computer programming is the ability to use code to solve various types of problems. Many of today’s programming languages have a syntax that reads much like plain English, so picking up programming is not as difficult as it used to be.
Math and statistics is the use of equations and formulas to perform analysis. Many of these concepts are familiar from school or college days. Of course, one can dive deep into the various theorems and derivations, but many of today’s freely available tools handle these complexities behind the scenes.
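As a small illustration of both points, here is a minimal Python sketch that uses only the standard library's statistics module. The room rates are made-up numbers for a hypothetical hotel, not real data; the point is only that the code reads close to plain English and that the formulas are handled for you.

```python
import statistics

# Hypothetical nightly room rates (in dollars); illustrative values only.
room_rates = [120, 135, 150, 110, 160]

# The library applies the formulas for the mean and the sample
# standard deviation, so we do not have to write them out by hand.
average_rate = statistics.mean(room_rates)
rate_spread = statistics.stdev(room_rates)

print(f"Average nightly rate: {average_rate}")
print(f"Spread (sample standard deviation): {rate_spread:.2f}")
```

Knowing what a mean and a standard deviation tell you about the data still matters; the tool only saves the arithmetic.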
To gain knowledge from data, one must be able to use computer programming to access the data, understand the mathematics behind the models one derives, and, above all, understand where the analysis fits within one’s domain.
When starting off in the field of data science, people tend to rely on the core strengths of the area they come from. Someone with a background in programming would explore applying code to derive mathematical models. Someone with a background in math would try out coding. Someone with domain knowledge would form hypotheses to test using code.
The intersection of math and programming leads to what is referred to as machine learning. However, one needs to be able to explicitly generalize models