There’s an elephant in the room with Big Data. If an organization tries to half-ass its way through a Big Data project, it’s going to fail (the odds of success are usually 5-10%). Given this really low success rate, should you even do Big Data? When I...
Unit testing your Kafka code is incredibly important. I’ve already written about integration testing, consumer testing, and producer testing. Now, I’m going to share how to unit test your Kafka Streams code. To start off with, you will need to change your...
As people start with Big Data, they go through the list of necessary skills. One of those crucial skills is programming. The question arises: how good do a person’s programming skills need to be? This is because programming skills are on a wide...
In my book Data Engineering Teams, I talk about a skill that’s often overlooked and unknown to data engineering teams. Teams often don’t know they need a veteran, think they can’t afford a veteran, or don’t understand why they need a veteran on...
In my seminal post On Complexity in Big Data, I talked about how much complexity increases with Big Data. The post itself focused on Big Data batch systems. I didn’t really cover the real-time complexity increases when dealing with Big Data. In the post, I argue...
I’m often asked how someone who is a consultant can get into Big Data. This is an important subject because it will define your success as a consultant in the field. More importantly, it will define how successful your customers will be. Learning If Big...