During my interviews for another case study in Data Teams, I was introduced to a concept I teach but hadn’t heard so brilliantly stated. The case study was with Justin Coffey and François Jehl, who were both at Criteo. Justin introduced this concept of keeping...
As a distributed systems person, I’m used to figuring out how to spread a problem across as many computers as possible. Spreading out a problem lets me leverage my resources far better and faster. However, we’re failing to apply this optimization...
Data teams require all of their parts to be complete and to succeed. When one of the teams that make up a data team is missing, the other teams suffer. Often, organizations or team members don’t understand what’s happening when a team is missing. They blame...
I’m thrilled to announce that Data Teams: A unified management model for successful data-focused teams is available for purchase! My goal is to drive a real increase in the percentage of successful big data projects. Data Teams represents years of work and...
Kafka 2.0 added a new poll() method that takes a Duration as an argument. The previous poll() took a long as an argument. The differences between the two polls don’t stop there. You should know about the differences before porting your poll from a long to a...
In my last post, I gave some general suggestions on how analytics and data engineering teams should be dealing with COVID-19. Now, I want to give specific advice on how data engineering teams can reduce their cloud costs. Note: while this post focuses on cloud,...