This post is a transcribed Ask-Me-Anything (AMA) focused on growing a career in machine learning, hosted by Rick Shih. Rick is a leader in developing solutions connecting machine learning with production data science teams. With a master’s degree in Machine Learning from the University of California San Diego, Rick started and led the machine learning team at Bloomreach, a company focused on improving e-commerce search and personalization. He is currently the Machine Learning lead at TruEra focusing on ML explanations and the data science life cycle for tabular, time series, NLP and beyond.
Q: When you started the ML team at Bloomreach, how did you find early wins to translate ML to business value and grow the team?
A: There are quite a few avenues to early wins. The first thing you need is a baseline to compare against. This can be an existing non-ML method or a simpler linear model. Once the baseline is in place, you can start the iterative development process and measure how much each new initiative improves on it.
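A minimal sketch of the baseline comparison described above, assuming a simple accuracy metric and a shared held-out set. The data, the `always_no` baseline, and the `click_rule` stand-in for a trained model are all hypothetical and only illustrate the workflow.

```python
def accuracy(predict, features, labels):
    """Fraction of held-out rows where the prediction matches the label."""
    correct = sum(1 for x, y in zip(features, labels) if predict(x) == y)
    return correct / len(labels)

def compare_to_baseline(baseline, candidate, features, labels):
    """Score both predictors on the same data with the same metric."""
    base = accuracy(baseline, features, labels)
    cand = accuracy(candidate, features, labels)
    return {"baseline": base, "candidate": cand, "lift": cand - base}

# Made-up held-out data: (price_rank, past_clicks) -> purchased (1) or not (0).
features = [(1, 0), (2, 5), (3, 4), (4, 8), (5, 0), (6, 7)]
labels = [0, 1, 0, 1, 0, 1]

def always_no(row):
    # Hypothetical non-ML baseline: always predict "no purchase".
    return 0

def click_rule(row):
    # Hypothetical stand-in for a trained model.
    return 1 if row[1] >= 3 else 0

report = compare_to_baseline(always_no, click_rule, features, labels)
print(report)
```

Keeping the metric and the evaluation set fixed is what makes the lift number comparable from one iteration to the next.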
As for growing the team, there are a few qualities you want in a machine learning team. Experience: The first hires we made were heads of data science, so that we could draw on what many other companies have already learned. There is a lot of experimentation behind those lessons, and it is great not to have to learn them the hard way. Creativity and Research: Once you expand beyond the simple baseline models, a huge array of open literature and ideas to try opens up. You then need to use your creativity and research skills to find the ones most applicable to your problem. Diligence: ML is a data science, and with that comes a lot of data issues. Once you build your models and start assessing their performance, the work quickly turns to what issues the models have. You need to go into the data and figure out which patterns your model handles poorly, or whether data issues are affecting how your models work.
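One way to start the kind of data digging mentioned under Diligence is to break model errors down by a data segment. The sketch below assumes predictions are stored as dictionaries. The `device` field and the records themselves are invented for illustration.

```python
from collections import defaultdict

def error_rate_by_slice(records, slice_key):
    """Return the error rate for each distinct value of slice_key."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for rec in records:
        key = rec[slice_key]
        totals[key] += 1
        if rec["predicted"] != rec["actual"]:
            errors[key] += 1
    return {key: errors[key] / totals[key] for key in totals}

# Made-up prediction log; field names are hypothetical.
records = [
    {"device": "mobile", "predicted": 1, "actual": 0},
    {"device": "mobile", "predicted": 0, "actual": 1},
    {"device": "mobile", "predicted": 1, "actual": 1},
    {"device": "desktop", "predicted": 1, "actual": 1},
    {"device": "desktop", "predicted": 0, "actual": 0},
]

rates = error_rate_by_slice(records, "device")
worst = max(rates, key=rates.get)  # slice with the highest error rate
print(rates, worst)
```

A slice with a noticeably higher error rate is a candidate for closer inspection, whether the cause turns out to be a modeling gap or a data issue.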
It is also important to have a combination of data science skills and engineering skills. Soon after experimentation, you need to see how you can actually connect your models to live production data. In the search world, we needed to figure out a way to connect model weights to Elasticsearch and to the inverted-index systems of Apache Lucene. Many times the key performance indicators that your models are trained on need to be adjusted slightly to match the systems that receive the real data. If your company is large enough, data science and engineering can be two teams, but at many startups these teams are at minimum closely interconnected and often make up one data science team.
Q: How is search ranking different from other kinds of machine learning? Which unique skills do you need for building search ranking models?
A: Search ranking differs in both model inputs and outputs. A ranked list is neither a classification nor a regression problem. Typically, list ranking performance is measured by Normalized Discounted Cumulative Gain (NDCG). However, NDCG is not usable directly as a modeling loss function, so you need to reframe the problem to fit. Typically you expand your training set by n-squared and do an item-to-item comparison. Then you convert that to either a classification (item A is better than item B) or a regression from which a list ranking can be reconstructed. The inputs may vary as well. You can have features that are tied to the item, features related to personalization for the person the item is relevant to, or global activity data. You can also have text data as part of the modeling process.
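The pairwise reframing and the NDCG metric mentioned above can be sketched roughly as follows. This is an illustration under assumptions, not the method used at Bloomreach: the gain formula `2**rel - 1` is one common NDCG variant, the feature-difference encoding is one common way to build pairwise examples, and the item features and relevance grades are made up.

```python
import math
from itertools import permutations

def dcg(relevances):
    """Discounted cumulative gain for relevance grades in ranked order."""
    # Assumed variant: gain 2**rel - 1 with a log2 position discount.
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances))

def ndcg(ranked_relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0

def pairwise_examples(items):
    """Expand one query's items into ordered item-to-item comparisons.

    items: list of (feature_vector, relevance) pairs.
    Returns (feature_difference, label) pairs, where label is 1 if the
    first item is more relevant than the second and 0 otherwise.
    Ties are skipped because they carry no ordering signal.
    """
    examples = []
    for (feat_a, rel_a), (feat_b, rel_b) in permutations(items, 2):
        if rel_a == rel_b:
            continue
        diff = [a - b for a, b in zip(feat_a, feat_b)]
        examples.append((diff, 1 if rel_a > rel_b else 0))
    return examples

# Made-up items for a single query: (features, relevance grade).
query_items = [([0.9, 0.1], 2), ([0.4, 0.7], 0), ([0.6, 0.3], 1)]

pairs = pairwise_examples(query_items)  # training rows for a binary classifier
perfect = ndcg([2, 1, 0])               # items already in the ideal order
reversed_order = ndcg([0, 1, 2])        # worst ordering of the same items
print(len(pairs), perfect, reversed_order)
```

With n items per query, the expansion produces on the order of n-squared rows, which is the training-set growth the answer refers to. A classifier trained on the pairs can then be used to score items, and the resulting ordering is evaluated with NDCG.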
For skillsets, base ML skills for any domain are the best place to start. Then you can specialize. For search ranking, there is a Learning to Rank field that is constantly being explored and growing. The same is true for any specific domain. Understanding the fundamentals first and then specializing is the best avenue.
Q: How is machine learning at a start-up whose product is built for data scientists different than at companies with more traditional value propositions?
A: The main difference is that at TruEra, we don’t build the models ourselves. However, we still need to know the same things. We cannot build a product for data scientists without understanding why they made the modeling or productionization decisions they did. We need to be intimately connected to the data science process, while at the same time not exactly doing it ourselves. Otherwise, we actually need similar skillsets. We should know the nuances of architectures and stay current with the newest modeling techniques. We also want to be able to automatically advise and help with areas that are traditionally done manually in data science, like scouring data, making modeling upgrades, and experiment tracking.
Q: For those just starting their data science career, how should they think about getting and demonstrating base ML skills?
A: One thing I like to make clear is that data science is as broad as computer science. It is a good idea to spend some time understanding what part of the field you want to be in. In data analyst workflows, you are focused on figuring out data trends, which can be either evaluative or for feature generation. You can also be a modeler, a unique position where you need to understand performance metrics, modeling techniques, architectures, and more. Modeling is a starkly different work area where many times there is no “right” answer and you have to be very comfortable with experimentation and trial and error. Research-oriented positions are another avenue. For these, understanding the math and algorithms is crucial, including linear algebra, statistics, and sometimes calculus, or else really understanding the nuances of the architectures. The area of ML infrastructure is huge as well. It involves understanding the large data world, including databases for ML, distributed compute, production monitoring, deployments, and GPU usage. There are plenty of jobs in high demand here.
Once you have identified which area of data science to pursue, there are resources everywhere. Traditionally, undergraduate and graduate education is great for the researcher and modeler positions. You can stay up to date by taking online courses or by staying involved in the data science community and academic literature. There are also data analyst certifications offered by many industry companies such as AWS, Intel, Microsoft and more.
Another way to advance is to have a good overlap of skillsets. I know people with PhDs in genetics who then transitioned to the data science world and work in the health insurance industry. This is also common with other hard sciences such as physics, biology, health, and energy.
That concludes this ML careers ask-me-anything with Rick Shih. This AMA originally took place in the AI Quality Forum; find this and more resources to help grow your career in machine learning there.