Why I wanted to be a data scientist

Posted by OMEGA MARKOS on April 28, 2020

It was the summer of my freshman year in college. As I was preparing to choose my major, I attended an orientation given by the program chairs. It was then that I learned the potential of statistics and how it could make a real difference in our society. The Chair’s passionate speech, coupled with my deep appreciation for numbers, swayed me to join the statistics department.
In my second year, I got a chance to take a programming course, and I really liked it. However, as a statistics major, it was the only computer science course I was allowed to take. The next course was open solely to computer science majors. The only option I had was to attend the lab with my friends who were computer science majors. But after a few weeks, I was unfortunately forced to stop attending the lab because of limited space. That is how I fell in love with coding!
After graduating, I joined the National Statistics Office as an assistant researcher and got involved in several research projects. My favorite was the Demographic and Health Survey, which was about women’s and children’s health. I was involved in every part of that survey, and as the only woman on the team, I had a lot of responsibilities. But that didn’t stop me from going to the data processing unit and evaluating the coding process. I even attended a CSPro coding workshop with the data processing team. That put me in an important position for the survey: I knew the survey very well, and by this point I kind of knew how to code. The most amazing thing is that policymakers use the survey results to make decisions that improve women’s and children’s health. Seeing other people’s lives improve, especially the children’s, was priceless.
In my experience, I saw a lot of shortcomings in my research job. Two different people would work on the same data: one knew only how to code, and the other knew the data very well but had no coding experience. In data analysis, you really need knowledge of the data you are working with. The process has a lot to do with grouping and aggregating different variables to help answer your questions. The meaning of those variables depends on how the data was collected and what question was asked during data collection. If you don’t have that knowledge, you can still group and aggregate different variables (columns) and get answers that look right but are, in reality, misleading. With large data sets it is very hard to verify your results, and sometimes this leads to a poor decision. The person who was involved in collecting the data is naturally more sensitive to the meaning of the values than a coder with less involvement.
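Here is a minimal sketch of that pitfall in Python. Everything in it is hypothetical and invented for illustration: the records, the column names, and the assumption that the survey uses the code 99 to mean "not stated". A coder who does not know about that code averages it in as if it were a real count, while someone who knows the questionnaire excludes it first.

```python
from collections import defaultdict

# Hypothetical convention: 99 means "not stated", not 99 children.
MISSING_CODE = 99

# Hypothetical survey records, invented for illustration only.
records = [
    {"region": "North", "children": 2},
    {"region": "North", "children": 4},
    {"region": "North", "children": MISSING_CODE},
    {"region": "South", "children": 1},
    {"region": "South", "children": 3},
]


def mean_by_group(rows, key, value, skip=None):
    """Group rows by `key` and average `value`, optionally skipping one code."""
    groups = defaultdict(list)
    for row in rows:
        if skip is not None and row[value] == skip:
            continue  # drop responses that carry the "not stated" code
        groups[row[key]].append(row[value])
    return {group: sum(vals) / len(vals) for group, vals in groups.items()}


# Aggregating without knowing what 99 means.
naive = mean_by_group(records, "region", "children")

# Aggregating with knowledge of how the data was collected.
informed = mean_by_group(records, "region", "children", skip=MISSING_CODE)

print("naive:   ", naive)
print("informed:", informed)
```

Both calls run without errors and return a number for every region, which is exactly why this kind of mistake is hard to spot without knowing the data.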
I always wanted to complete the circle of processing and analyzing data myself, but I was missing coding and predictive modeling skills. Data science is about working with big data and telling a story that cannot be told otherwise. That is why I wanted to be a data scientist.