Engineers starting out with Big Data often ask the same beginner question. An engineer will post on a social media site: “I need to know which Big Data technology to use. I have 3 billion rows in 10,000 files. The whole dataset is 100 GB. Is Big...
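To put the numbers in that question in perspective, here is a rough back-of-the-envelope sketch. It uses only the three figures quoted in the post; treating a gigabyte as 10^9 bytes is my assumption, and the variable names are purely illustrative.

```python
# Figures quoted in the question; decimal gigabytes (10**9 bytes) are an assumption.
TOTAL_BYTES = 100 * 10**9   # "The whole dataset is 100 GB"
ROWS = 3 * 10**9            # "3 billion rows"
FILES = 10_000              # "10,000 files"

avg_file_mb = TOTAL_BYTES / FILES / 10**6   # average size of one file, in MB
avg_row_bytes = TOTAL_BYTES / ROWS          # average size of one row, in bytes
rows_per_file = ROWS // FILES               # average number of rows per file

print(f"average file size: {avg_file_mb:.0f} MB")
print(f"average row size:  {avg_row_bytes:.1f} bytes")
print(f"rows per file:     {rows_per_file:,}")
```

The averages come out to roughly 10 MB per file and about 33 bytes per row. The arithmetic alone doesn't settle which technology to use, but it does make the scale of the question concrete.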
There are two different types of data engineering, and two different types of jobs that carry the title data engineer. This is especially confusing to organizations and individuals who are just starting to learn about data engineering. This confusion leads to the...
One of the benefits of teaching and consulting is the sheer number of organizations, teams, and people I get to work with. Since I deal with so many different groups, I can see patterns emerge much faster than others. One pattern I saw early on was real-time Big Data....
Creating real-time data pipelines brings new challenges. There are new concepts and technologies that you’ll need to learn and understand. To help you understand the basic technologies you need in a real-time data pipeline, I break it down into 4 general types....
The move from batch to real-time Big Data represents change. It will entail using brand new technologies and concepts that you haven’t dealt with before.

Batch Big Data

Let’s start off by defining batch Big Data. For batch, all data must be there when the...
I wrote a post for the O’Reilly data blog going into my latest thoughts and views on data engineers versus data scientists. I continue on to talk about machine learning engineers.