Definitions are blurry, and so are skill requirements
There are many ways to define data science. The most popular one seems to be that data science sits at the intersection of computer science, maths & statistics, and domain knowledge.
It is always easy to criticise the commonly used Venn diagram above. However, keep in mind that such diagrams are purposefully oversimplified and therefore naturally flawed. Personally, I believe it is a useful way to conceptualize data science. If your work involves computer science (programming, databases, cloud infrastructure), math & statistics (statistics, stochastics, machine learning), and domain knowledge, all to a non-trivial extent, you are probably doing data science.
Data scientists do widely different things in practice
The problem is that this definition is very general. I’ve met data scientists who…
- are unable to use fundamental programming tools or techniques for their analyses
- have never trained a machine learning model
- are isolated from the real business, focusing primarily on data pipelines or performance optimization
On the other hand, I’ve met…
- Software engineers who train machine learning models
- Data analysts who build complex data pipelines using Python
- Business analysts who use advanced statistical models but have never thought of them as AI
Data science-related job roles can be quite confusing in the real world, because…
- There is significant skill overlap between similar roles (data analyst, data engineer, data scientist, machine learning engineer, AI engineer)
- Companies define these job roles differently depending on their industry and size
- People take on new responsibilities but stay in the same job, never changing their job title
- Job requirements for the same role change rapidly
Being able to pull data from a data warehouse using SQL and visualize statistical insights using Python would have secured you a great job as a data scientist 10 years ago. Nowadays, you may still have a shot in a traditional organization like a large insurance company. However, if you are trying to join a unicorn tech startup as a data scientist, you had better know how to train ML models, deploy them to the cloud, and set up monitoring and retraining mechanisms with data, model, and code versioning. If you have 10+ years of experience using ChatGPT, that’s another plus.
Finding your personal development path
I think the key insight from these observations is that you should focus your personal skill development on what brings business value, not on what some arbitrary definition of your current job title requires.
If you are solving relevant business problems, enjoy your work, and are well compensated, don’t worry about what others think the market demands from you.
Of course, you should strive to expand your skill set, and in today’s world, staying in the same role at the same company for 10 years is rarely optimal for long-term skill progression. But if you have found a business niche where your personal skill set is highly valued, you can be sure that there are other companies with the same problem. Your job is to make sure you can solve this problem, now and in the future.
Comparing yourself to others can be useful, but also distracting. Others have different personalities and interests and are probably doing a completely different job than you. Programming, machine learning, cloud platforms, etc. are only tools. Learn the tools that you really need to be competent at solving a specific business problem.