Full Title: Beware the Data Science Pin Factory: The Power of the Full-Stack Data Science Generalist and the Perils of Division of Labor Through Function
Document Note: But the goal of data science is not to execute. Rather, the goal is to learn and develop profound new business capabilities. Algorithmic products and services [...] can’t be designed up-front. They need to be learned. There are no blueprints to follow; these are novel capabilities with inherent uncertainty. With data science, you learn as you go, not before you go.
The generalist moves fluidly between functions, extending the data pipeline to add more data, trying new features in the model, deploying new versions to production for causal measurement, and repeating the steps as quickly as new ideas come to her.
Key ideas
A dysfunctional relationship between DS and DE is common in the industry. > What is your experience here?
Unless you have tons of people, you end up with that sort of relationship. But having that many people is not very efficient.
Generalists who are able to use a platform are the answer.
Change the mindset to autonomous doers. Engineers work horizontally, DS vertically.
How do we get platform engineers to stay ahead of DS teams? > My proposal is to involve them in our decisions.
We are sacrificing technical efficiency for velocity and autonomy. It is important to recognize this as a deliberate trade-off.
DS are well equipped to make trade-offs between technical and support costs vs. requirements.
Algorithmic products and services [...] can’t be designed up-front. They need to be learned.
This division of labor by function is so ingrained in us even today that we are quick to organize our teams accordingly. Data science is no exception.
Alas, we should not be optimizing our data science teams for productivity gains; that is what you do when you know what it is you’re producing, pins or otherwise, and are merely seeking incremental efficiencies.