3 Secrets I learned working as a Machine Learning Engineer

Doing certifications is not enough; you need hands-on practice (no, this is not one of the secrets). Even though I was a trained Machine Learning Engineer, I didn't know these 3 secrets until I started working with CloudxLab, and every Machine Learning Engineer and Data Scientist should know them.

The 3 secrets are actually 3 tools that every Data Scientist/Machine Learning Engineer should have in their arsenal. Unfortunately, they are probably also the 3 most overlooked tools, hence this post. What are these tools?

1. Linux Command Line
Most Data Scientists take the Linux Command Line for granted. However, it is one of the most powerful tools you will ever use, so much so that it has the power to make or break your day.

Learn the Linux Command Line by heart, and not just the basic commands but also more advanced tools like sed, grep, awk, and vi, along with regular expressions (regex). sed parses and transforms text from the command line, grep uses regex to search for text patterns, awk handles text processing, data extraction, and reporting, and vi is a popular and commonly used text editor. Regex is not a command itself but the pattern syntax that grep, sed, and awk all understand.

These are especially handy when you are working on NLP projects.
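To make the roles of these tools concrete, here is a minimal Python sketch of what grep, sed, and awk each do to a handful of text lines. It is not a replacement for the real commands. The sample lines and the helper names (`grep`, `sed_substitute`, `awk_count_field`) are made up for illustration, and the shell one-liners in the docstrings are only rough equivalents.

```python
import re
from collections import Counter

# Hypothetical sample corpus; in practice these would be lines read from a file.
LINES = [
    "2021-03-01 INFO  user=alice action=login",
    "2021-03-01 ERROR user=bob action=upload",
    "2021-03-02 INFO  user=alice action=logout",
    "2021-03-02 ERROR user=carol action=login",
]


def grep(pattern, lines):
    """Keep lines that match a regex, roughly like `grep -E pattern`."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]


def sed_substitute(pattern, replacement, lines):
    """Replace every regex match in each line, roughly like `sed -E 's/pattern/replacement/g'`."""
    rx = re.compile(pattern)
    return [rx.sub(replacement, line) for line in lines]


def awk_count_field(index, lines):
    """Count the values of one whitespace-separated field (0-based index),
    roughly like `awk '{count[$2]++} END {...}'` for index 1."""
    return Counter(line.split()[index] for line in lines)


errors = grep(r"\bERROR\b", LINES)                        # search
masked = sed_substitute(r"user=\w+", "user=***", errors)  # transform
levels = awk_count_field(1, LINES)                        # extract and report

print(masked)
print(dict(levels))
```

The same three steps (search, transform, summarize) are what you end up chaining with pipes on the command line when cleaning a text corpus.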

2. Git
Git is an open-source version control system.

I am sure you have heard legendary stories about lost data. Git can make your life a lot easier by helping you maintain versions of your work, both locally and online on GitHub/GitLab. In combination with the Linux Command Line, learning Git is a must no matter which organization you work for, or whether you are freelancing.

3. SQL
When I talk about SQL, I am not talking about any database in particular. I am talking about a language with which you can manipulate and query your data. I cannot emphasize enough how important this is, considering that the first part of every Machine Learning project is EDA, or Exploratory Data Analysis. And I am sorry to burst the bubble, but data does not come only as CSV files like you were taught during your certification course. More often than not it is stored in a database, and if you know SQL you are already ahead of the game.
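As a rough illustration of EDA done directly in SQL, the sketch below builds a tiny in-memory table and asks two typical first questions: how many values are missing, and how do the groups compare? The `customers` table, its columns, and its rows are invented for this example, and SQLite is used only because it ships with Python. The queries stick to standard aggregates, although the `SUM(age IS NULL)` shortcut relies on SQLite treating a comparison as 0 or 1.

```python
import sqlite3

# In-memory database with a hypothetical table; real data would already live in a database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, country TEXT, age INTEGER, churned INTEGER)"
)
conn.executemany(
    "INSERT INTO customers (country, age, churned) VALUES (?, ?, ?)",
    [("IN", 34, 0), ("IN", 29, 1), ("US", 41, 0), ("US", None, 1), ("US", 52, 0)],
)

# Question 1: how many rows are there, and how many have a missing age?
total_rows, missing_age = conn.execute(
    "SELECT COUNT(*), SUM(age IS NULL) FROM customers"
).fetchone()

# Question 2: per-country row count, average age (NULLs are ignored by AVG), and churn rate.
per_country = conn.execute(
    "SELECT country, COUNT(*) AS n, AVG(age) AS avg_age, AVG(churned) AS churn_rate "
    "FROM customers GROUP BY country ORDER BY country"
).fetchall()

print(total_rows, missing_age)
for row in per_country:
    print(row)
```

Pushing the aggregation into the query means only the summary leaves the database, which matters once the table no longer fits comfortably in memory.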

I wish I had known these before I started working as a Machine Learning Engineer. However, it's never too late to start learning these in-demand Data Science tools!