Learning Python for Data Science from Zero (Part III — Advanced to Master Level)

Sharing from a finance professional who has established an AI startup.


Hudson Ko

2/24/2023 · 5 min read

graphs of performance analytics on a laptop screen

My Learning Journey in 5 Stages

  1. (Baby Level — Test your passion and basic ability)
    DataCamp’s “Introduction to Python” course

  2. (Beginner Level — Real Start)
    DataCamp’s “Data Scientist” career track

  3. (Intermediate Level — Certificates Hunter)
    Coursera’s “Applied Data Science with Python” by the University of Michigan, or equivalent courses on edX

  4. (Advanced Level — Time to build something yourself)
    A postgraduate degree AND/OR learning by doing and posting on Medium

  5. (Master Level — The More you learn, the Less you know)
    Build your own portfolio on GitHub

In this article, I will continue to share my personal learning journey from advanced to master level. If you are a beginner and do not know where to start learning Python for Data Science, you may go back to the article here. For intermediate level learners, please refer to the guide here.

4. Advanced Level (1–5 years)

You should give yourself a big hand if you have come to this stage. You have demonstrated huge determination and passion, which will define your success in the future.

Before introducing the learning paths for the advanced level, let’s take a look at the chart below. Completing the courses on DataCamp, Coursera and edX should give you an idea of what these terms mean and how Artificial Intelligence, Machine Learning and Data Science differ. These three domains are made up of many sub-topics that are closely related to each other.

You have now come to a critical moment: it is time to choose a few topics to focus on or, in other words, to choose between being a generalist and being a specialist. After learning Python and Data Science for some time, you should realize that there is a great deal under this domain, not to mention the frontend and backend knowledge that is necessary if you want to become a developer. Time is limited, so you can either deepen your knowledge in a few selected topics or broaden your understanding as much as possible within the AI/Data Science domain.

Before making the decision, you should ask yourself:

  • What are you truly interested in?

  • What do you want to do after learning Python and Data Science?

Your final target will shape your future learning path, not only the learning sequence but also the destination. Below are three possible ways to continue learning at the advanced level.

  1. Study for a Master’s degree if you just want to outperform 95% of people in non-IT industries and work as a Data Scientist in a non-IT company. In the eyes of HR and in terms of qualifications, a Master’s degree from a reputable university distinguishes you well from those who have only completed online certificates. However, that does not necessarily mean you are more capable in practice. (If anyone is interested in how to choose a decent Master of Data Science program, please let me know. I am more than happy to write an article about that.)

  2. Study for a PhD if you want to be an expert in a specific domain of Data Science or Artificial Intelligence. Computer Vision, NLP and Reinforcement Learning are some of the topics you can choose to research during your PhD journey. With very few exceptions, only people with a PhD can work as a Data Scientist or Artificial Intelligence Engineer in IT companies, especially if your dream is to work at Google or Microsoft. Excited about AlphaGo from DeepMind? Get a PhD first.

  3. Learn by doing and become a Medium writer if you are tired of studying regularly in a structured way and want to do something interesting on your own. There is still a lot of material not covered in the previous courses, for example web scraping, geographic information systems (GIS), blockchain data and cloud computing. You should now have the foundation to start some mini-projects based on your own interests.

    Learning by doing is definitely a good way to build hands-on experience. When you run into trouble and do not know how to solve a programming problem, just google it and you will find the answer on Stack Overflow in 99.9% of cases. If not, read the documentation of the libraries you are using.

    An important step in making this learning path successful is to record your journey on Medium or a blog and share it publicly. Towards Data Science, Analytics Vidhya and Towards AI are some of the well-known publications on Medium that you can submit your articles to. On the one hand, it generates some income when people read your articles. On the other hand, it builds your profile and shows that your learning journey never stops. “A writer who has published 20+ articles about Python/Data Science/AI with more than 100K views” is definitely a cool line to put on your CV.

5. Master Level (Lifelong)

Frankly speaking, I should not call this stage Master Level, but Real Baby Level. The more you learn, the less you know.

Technology is evolving every day, and so are Artificial Intelligence and Data Science. What you learn today may be completely outdated a year later. It is therefore an endless learning journey. Prepare yourself for that, mentally and physically.

Sorry to say, I may not be the most suitable person to introduce the learning path for this level, because I do not work as a Data Scientist in the IT industry, or even in the financial sector. But I can share my opinion based on my experience as the boss of an AI startup. (Yes, I co-founded a startup while studying in the Master of Data Science program, which is another story entirely.)

GitHub Profile

When I interview applicants for a full-time Data Scientist role or a Data Science internship, the next thing I do after screening their CVs is browse their GitHub profiles to see what projects they are working on and how well they code. (Adding your GitHub profile link to your CV is a basic step for anyone applying for coding-related positions.) This is common practice among interviewers at IT companies. Therefore, please keep your GitHub profile tidy and showcase projects that are closely related to the positions you are applying for.

  • If you are applying for a Machine Learning Specialist role, please show that you are able to use the TensorFlow or PyTorch framework and implement different types of neural networks.

  • If you are interested in Computer Vision, please work on some projects using Convolutional Neural Networks and libraries like OpenCV.

  • If you are looking for a job focusing on Natural Language Processing, please include some projects about Transformers and Sentiment Analysis.

Don’t tell me that your GitHub only contains a single line of code that prints “Hello World” when you are applying for a Facebook Data Scientist role.

Besides showing your ability to handle different types of projects, it would be great if you can write code clearly and efficiently.

  1. Organize your code, scripts and folders logically. Never give a variable a meaningless name. Of course, i, j, k inside a for-loop are acceptable, but what about aabbcc? The code in your GitHub profile shows your personality and working habits.

  2. Write code that runs efficiently in terms of time and space complexity. You may never have cared about this before, but it decides whether you can become a top-tier programmer. Some people only focus on whether the code they write produces the right result, while others pay attention to improving its efficiency so that it runs in less time and occupies less memory. It depends on your experience and (unfortunately) whether you were born with some programming talent.
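To make the two points above concrete, here is a minimal sketch in Python. The task (finding repeated customer IDs), the function names and the sample data are all made up for illustration; they are not taken from any real project. Both functions use descriptive names, but the first compares every pair of items (roughly O(n²) time), while the second remembers what it has already seen in a set (roughly O(n) time on average, at the cost of O(n) extra memory).

```python
from typing import Hashable, Iterable, List


def find_duplicates_quadratic(items: List[Hashable]) -> List[Hashable]:
    """Compare every pair of items: about O(n^2) time, little extra memory."""
    duplicates = []
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            # i and j are fine as loop indices; everything else has a real name.
            if items[i] == items[j] and items[i] not in duplicates:
                duplicates.append(items[i])
    return duplicates


def find_duplicates_linear(items: Iterable[Hashable]) -> List[Hashable]:
    """Remember seen values in sets: about O(n) time on average, O(n) extra memory."""
    seen = set()       # every value encountered so far
    reported = set()   # values already added to the result
    duplicates = []
    for item in items:
        if item in seen and item not in reported:
            duplicates.append(item)
            reported.add(item)
        seen.add(item)
    return duplicates


# Hypothetical sample data, for illustration only.
customer_ids = [101, 205, 101, 309, 205, 101]

print(find_duplicates_quadratic(customer_ids))
print(find_duplicates_linear(customer_ids))
```

On a list this small the difference is invisible, but on millions of rows the pairwise version becomes impractical while the set-based one stays fast. The trade-off is memory: the second version keeps a copy of every distinct value it has seen.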

This brings me to the end of sharing my learning journey in Python for Data Science, from Baby Level to Master Level. Feel free to let me know if you have any questions while learning, or if there are any specific parts you want me to share in more detail.

Learning Python and Data Science is a lifelong journey. Maybe one day, a new programming language will replace Python, and a fancier name will appear for Data Science. It doesn’t matter. The things you learnt will never become useless. Your determination and passion will lead you to success.

Persistence is always the most important thing to learn.