If you are a programmer, then I have some good news for you!
This applies if you are a Software/Application Developer or Engineer. As you read, keep these two points in mind:
First, a very competent programmer always has some edge over people with other skills. The reason is that a programmer is a problem solver. Therefore you are not limited to any particular field. You should be able to dive into any field and translate its concepts into algorithms (series of steps) and then into a working computer program.
Second, a lot of Data Scientists do not really do much. This is true. They spend most of their time fetching data from one application or system, feeding it into another application and then examining the outputs. These systems are developed by programmers! Besides, many Data Analysts think they are Data Scientists, but this is a misconception. We’ll probably talk about this in another article, and then also talk about Machine Learning Experts.
Now let’s look at the 5 steps to becoming a Data Scientist.
Step 1 – Get Some Knowledge of Databases
Now, if you are like me and have been a DBA (Database Administrator) for many years, then you can skip to the next step. Otherwise, if you are just a programmer, you need to spend a few weeks getting your head around a DBMS. It could be Oracle, MSSQL, MySQL or any other one.
You should focus on data aggregation using different functions (AVG, STDEV, STDEVP, VAR, VARP etc.), together with NULL-handling functions such as COALESCE and NVL. Which of these are available depends on the DBMS you choose. Also work on filtering data using different selection criteria.
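To make this concrete, here is a minimal sketch of aggregation and filtering, run from Python against SQLite (which ships with Python, so there is nothing to install). The `sales` table and its rows are made up purely for illustration. SQLite does not provide NVL or STDEV out of the box, so the sketch sticks to AVG, COUNT, SUM and COALESCE.

```python
import sqlite3

# Hypothetical in-memory table, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("North", 100.0), ("North", 300.0), ("South", 50.0), ("South", None)],
)

# Aggregation per region. AVG skips NULL values, while COALESCE
# replaces a NULL with a fallback (here 0) before SUM sees it.
rows = conn.execute(
    """
    SELECT region,
           COUNT(*)                 AS row_count,
           AVG(amount)              AS avg_amount,
           SUM(COALESCE(amount, 0)) AS total_amount
    FROM sales
    GROUP BY region
    ORDER BY region
    """
).fetchall()

# Filtering with a selection criterion in the WHERE clause.
# Rows whose amount is NULL never satisfy the comparison.
big_sales = conn.execute(
    "SELECT region, amount FROM sales WHERE amount > ? ORDER BY amount",
    (75,),
).fetchall()

for row in rows:
    print(row)
print(big_sales)
```

Notice that the South region has one NULL amount: COUNT(*) still counts that row, but AVG ignores it. Small details like this are exactly what you want to get comfortable with.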
Then you need to be really good with Joins in SQL. Read about Joins.
I cover everything you need to know about SQL Joins in that lesson, and you will see it’s quite clear. A JOIN in SQL is used to combine records from two or more tables. The combination is based on a given criterion. Also note that the tables need to be related in some way, usually through a common column.
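As a quick illustration, here is a small sketch of two common joins, again using SQLite from Python. The `customers` and `orders` tables, and the `customer_id` column that relates them, are hypothetical names chosen for this example.

```python
import sqlite3

# Two hypothetical tables related through orders.customer_id.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bola'), (3, 'Chen');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
    """
)

# INNER JOIN: only customers that have at least one matching order.
inner_rows = conn.execute(
    """
    SELECT c.name, o.total
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
    """
).fetchall()

# LEFT JOIN: every customer, even those without orders.
# COUNT(o.id) counts only the non-NULL (matched) order rows.
left_rows = conn.execute(
    """
    SELECT c.name, COUNT(o.id) AS order_count
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
    """
).fetchall()

print(inner_rows)
print(left_rows)
```

The customer with no orders disappears from the INNER JOIN result but shows up in the LEFT JOIN result with a count of 0. That difference is the first thing to understand about joins.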
Once you are up to speed with SQL, it is time to do some basic Math.
Step 2 – Review Programming-Related Math
I know, I know! Math is not always interesting! But then, this is a bridge you need to cross to become a professional in your field.
The interesting thing is that the Math you need is really not the toughest kind. So you don’t have to start learning Laplace Transforms, Fourier Series etc. But come on! As a programmer, you have to be smart and brainy!
However, you need a good understanding of Number Systems: binary, octal, decimal and hexadecimal. Learn how to convert a number from one base to another and, in particular, how to convert between binary and decimal numbers.
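If you want to check your understanding, here is a minimal sketch of base conversion in Python. The helper name `to_base` is my own, written out by hand with the repeated-division method so you can see the steps. Python's built-in `int`, `bin`, `oct` and `hex` do the same job.

```python
def to_base(number: int, base: int) -> str:
    """Convert a non-negative integer to a digit string in base 2..16
    using repeated division: collect remainders, then reverse them."""
    digits = "0123456789ABCDEF"
    if number < 0 or not 2 <= base <= 16:
        raise ValueError("expected number >= 0 and 2 <= base <= 16")
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        number, remainder = divmod(number, base)
        remainders.append(digits[remainder])
    return "".join(reversed(remainders))


binary = to_base(45, 2)         # "101101"
octal = to_base(45, 8)          # "55"
hexadecimal = to_base(45, 16)   # "2D"

# Going back to decimal: int() accepts the base as a second argument.
decimal = int("101101", 2)      # 45

# The built-ins give the same digits, with a prefix naming the base.
print(bin(45), oct(45), hex(45))  # 0b101101 0o55 0x2d
```

Try doing 45 by hand first: 45 = 32 + 8 + 4 + 1, which is why the binary form has 1s in exactly those positions.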
A knowledge of Geometry is also very important because, at the end of the day, you need to provide some visual representation of your data.
You also need a bit of logarithms and some knowledge of Statistics. I do think Statistics needs to be at the top of the list. Then some Probability Theory is in order as well.
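To give you a feel for the level of Math involved, here is a short sketch using only Python's standard library. The sample values and the two-dice question are arbitrary examples of mine, not data from any real project.

```python
import math
import statistics
from fractions import Fraction
from itertools import product

# Arbitrary sample values, chosen only for illustration.
samples = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(samples)        # central value
median = statistics.median(samples)    # middle of the sorted values
pop_sd = statistics.pstdev(samples)    # population standard deviation
sample_sd = statistics.stdev(samples)  # sample standard deviation (n - 1)

# Logarithms: how many times must 1 be doubled to reach 1024?
doublings = math.log2(1024)

# Probability: chance that two fair dice sum to 7, found by
# enumerating all 36 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=2))
favourable = [pair for pair in outcomes if sum(pair) == 7]
p_seven = Fraction(len(favourable), len(outcomes))

print(mean, median, pop_sd, sample_sd)
print(doublings, p_seven)
```

The difference between `pstdev` and `stdev` is the same as the difference between STDEVP and STDEV in SQL: one treats your data as the whole population, the other as a sample of it.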
Step 3 – Get Used to Collection Types
In many programming languages, there are a number of collection types you need to be able to use. For example, in Python, we have Sets, Lists, Tuples and Dictionaries.
Similarly, in Java, we have types like List, ArrayList and so on. Find out the methods available for these types and spend some time playing around with them. You can start with Python here.
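Here is a quick sketch of the four Python collection types mentioned above, with one or two typical operations on each. The variable names and values are made up for the example.

```python
# List: ordered, mutable, allows duplicates.
scores = [72, 88, 95, 88]
scores.append(60)
top_two = sorted(scores, reverse=True)[:2]

# Tuple: ordered but immutable, handy for fixed records.
point = (3, 4)
x, y = point  # unpacking

# Set: unordered, no duplicates, fast membership tests.
unique_scores = set(scores)
has_95 = 95 in unique_scores

# Dictionary: maps keys to values.
ages = {"Ada": 36, "Bola": 29}
ages["Chen"] = 41
unknown_age = ages.get("Dayo", 0)  # fallback when the key is missing

print(top_two, x + y, len(unique_scores), sorted(ages))
```

A good exercise is to run `dir(list)`, `dir(set)` and `dir(dict)` in the Python shell and try out each method you do not recognise.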
Step 4 – Start Using Jupyter Notebooks and R (RStudio)
Jupyter Notebook is a free web-based IDE used by Data Scientists to build Machine Learning models. You can get it for free by installing Anaconda (a distribution of Python). To use Jupyter Notebook well, I would recommend you acquire a good knowledge of Python programming.
You can start with these ones:
- How to Set up Jupyter Notebook with Python 3 – Part 1
- How to Set up Jupyter Notebook with Python 3 – Part 2
- Getting Started with TensorFlow in Jupyter Notebook
R is another programming language, used for statistical computing and Machine Learning. The interesting thing is that it is very easy to learn.
To get started with R, simply install RStudio and you are good to go. You can also follow my R Lessons below. I recommend you subscribe to get updates.
Step 5 – Have a Face-to-Face with Kindson The Genius!
As you may already know, I’m currently a researcher with a strong passion for knowledge sharing. My research area is Enterprise Application Integration with a focus on Microservices.
I also carry out research on Machine Learning-based Modelling and Data Analytics. So I’m very much interested in helping young professionals improve their skills in Data Science and Analytics.
I provide needed guidance for my subscribers from time to time in various areas. So feel free to reach out to me and request a face-to-face.
Thanks for your effort in learning, and be sure that your hard work will pay off!