Sometimes I’ll train at a company that’s creating a data engineering team. The team often includes a Data Scientist. I always make a point of talking to the Data Scientist about their experience and interactions with the team before I arrived. These...
You don’t have previous Big Data experience, but want to get hired as a Data Engineer. Don’t worry, you can get hired. You’ll need a well-executed personal project that gets you noticed and shows your skills. I’ve verified this with hiring...
I’m open-sourcing one of the modules I wrote for my Real-time Data Engineering class. We use Apache Spark and Apache Kafka to process data. Then, we show the data in real time on a webpage using this JavaScript module to pull in data from Kafka via the Kafka...
In my book Data Engineering Teams, I treat programming as a separate skill from distributed systems. The section, “Skills Needed in a Team,” covers the various skills that a data engineering team needs. Several people have emailed me...
I’m really tired of seeing Big Data projects fail. They fail for both technical and managerial reasons, and the reasons are usually similar. That’s just sad, because we can fix or prevent them. Gartner’s research shows that 85% of Big Data projects...
I’ve been teaching Kafka at companies without the textbook definition of Big Data problems. They don’t have, and will not have in the future, what you’d define as Big Data problems. As a result, the students ask me if using Kafka is appropriate for...