There’s a friendly game among Big Data frameworks: what’s the fewest lines of code it takes to do WordCount? I’m a committer on Apache Beam, and most of my time is dedicated to making Beam easier for developers to use. I also help...
We’re coming up on that time of year when many people set their goals for the next year. Before you do that, reflect on how you did this year. If you accomplished a goal, how did you do it? If you didn’t accomplish a goal, what happened? Many people wrote in...
Unit testing your Kafka code is incredibly important. It’s transporting your most important data. This is especially true for your Consumers. They are the end point for using the data. There are often many different Consumers using the data. You’ll want to...
Unit testing your Kafka code is incredibly important. It’s transporting your most important data. As of 0.9.0 there’s a new way to unit test with mock objects. Refactoring Your Producer: First of all, you’ll need to be able to change your Producer at...
I’m often asked what I think will happen to Big Data over the next five to ten years. From a Developer’s point of view, the real question is whether investing their time in becoming a Data Engineer will pay off. We’re going to see a continuing maturity of...
You’re considering a change to become a Data Engineer. Why should you do it? Why shouldn’t you do it? Let’s consider some reasons. Should: There is a major shortage of qualified Data Engineers. There is a high demand and low supply of qualified Data...