In my seminal post On Complexity in Big Data I talked about how much complexity increases with Big Data. The post itself focused on Big Data batch systems. I didn’t really cover how complexity increases when dealing with real-time Big Data. In the post, I argue...
Writing your own distributed system shouldn’t be a task you undertake lightly. Too often, I’m seeing teams create their own distributed system. In my experience, this is because they don’t know or think about all of the ramifications of creating...
You’re starting to learn about Big Data or you’re wanting to learn more about Big Data. You start off by googling ‘what is Big Data?’ You get an answer that doesn’t quite make sense. The site talks about 3 Vs or sometimes they’re 4...
As I’ve worked with software teams, I’ve found some interesting views on distributed systems. Some teams think they’re creators of distributed systems. They usually aren’t. I think there are three main groups of teams that interact with...
In my book, Data Engineering Teams, I talk about the right skills and people for a data engineering team. The right skills and people are incredibly important to the success, or failure, of a Big Data project. Sometimes it’s easier to understand this point...
Sometimes I’ll write a post and the comments will say something to the effect of “this is useless.” Other times I’ll be finishing up a class and a student will ask me why I didn’t cover what they’re trying to do. I’ve written...