Join 📚 Kevin's Highlights
A batch of the best highlights from what Kevin's read.
When we unpack the common threads of how various people define data engineering, an obvious pattern emerges:
a **data engineer**
*gets data, stores it, and prepares it for consumption*
by **data scientists**, **analysts**, and others.
We define data engineering and data engineer as follows:
**Data engineering** is
the *development*, *implementation*, and *maintenance*
of **systems** and **processes** that take in raw data
and produce high-quality, consistent information
that supports downstream use cases,
such as analysis and machine learning.
**Data engineering** is
the intersection of
*security*,
*data management*,
*DataOps*,
*data architecture*,
*orchestration*, and
*software engineering*.
A **data engineer**
*manages the data engineering lifecycle*,
beginning with getting data from source systems and
ending with serving data for use cases,
such as analysis or machine learning.
Fundamentals of Data Engineering
Reis, Joe; Housley, Matt
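To make the "gets data, stores it, and prepares it for consumption" pattern concrete, here is a minimal sketch of the lifecycle using only the Python standard library. The CSV contents, the table names (`raw_orders`, `orders`), and the function names are all hypothetical and chosen for illustration; they do not come from the book.

```python
import csv
import io
import sqlite3

# Hypothetical raw source data: inconsistent casing, stray whitespace, one missing amount.
RAW_CSV = """customer,amount
 Alice ,10.5
bob,4
ALICE,2.5
carol,
"""

def ingest(raw_text):
    """Get data from the source system (here, an in-memory CSV string)."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def store(conn, rows):
    """Store the raw rows unchanged so they can be reprocessed later."""
    conn.execute("CREATE TABLE raw_orders (customer TEXT, amount TEXT)")
    conn.executemany(
        "INSERT INTO raw_orders VALUES (?, ?)",
        [(r["customer"], r["amount"]) for r in rows],
    )

def transform(conn):
    """Prepare consistent data: normalize names, drop rows with no amount."""
    conn.execute(
        """
        CREATE TABLE orders AS
        SELECT lower(trim(customer)) AS customer, CAST(amount AS REAL) AS amount
        FROM raw_orders
        WHERE trim(amount) <> ''
        """
    )

def serve(conn):
    """Serve an aggregate that an analyst could consume directly."""
    cur = conn.execute(
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
    )
    return dict(cur.fetchall())

conn = sqlite3.connect(":memory:")
store(conn, ingest(RAW_CSV))
transform(conn)
totals = serve(conn)
print(totals)
```

Keeping the raw table separate from the cleaned one is a design choice in this sketch, not a rule from the highlight: it lets the transformation be rerun if the cleaning logic changes.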
LATCH is an approach for finding your data and what you should put in your notes, so that you can find them later.
- from Richard Saul Wurman's *Information Anxiety*.
LATCH stands for five things:
**L**ocation,
**A**lphabet,
**T**ime,
**C**ategory and
**H**ierarchy.
How to Organize Your Notes in Obsidian // the LATCH Method
Nicole van der Hoeven
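As a rough illustration of the five LATCH dimensions, the sketch below sorts a few notes along each one. The `Note` fields, the sample notes, and the mapping of "hierarchy" to an importance ranking are assumptions made for this example; they are not taken from the video or from Obsidian itself.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Note:
    title: str
    location: str
    created: date
    category: str
    importance: int  # hypothetical ranking, 1 = most important

# Each LATCH dimension maps to a sort key on the hypothetical Note fields.
LATCH_KEYS = {
    "location": lambda n: n.location,
    "alphabet": lambda n: n.title.lower(),
    "time": lambda n: n.created,
    "category": lambda n: n.category,
    "hierarchy": lambda n: n.importance,
}

def organize(notes, by):
    """Return the notes ordered along one LATCH dimension."""
    return sorted(notes, key=LATCH_KEYS[by])

notes = [
    Note("Zettelkasten basics", "Berlin", date(2021, 3, 1), "method", 2),
    Note("data pipelines", "Austin", date(2022, 11, 15), "engineering", 1),
    Note("Meeting recap", "Chicago", date(2023, 1, 9), "work", 3),
]

by_alphabet = [n.title for n in organize(notes, "alphabet")]
by_time = [n.title for n in organize(notes, "time")]
print(by_alphabet)
print(by_time)
```

The same three notes come out in a different order depending on the dimension, which is the point of recording more than one of these attributes when writing a note.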
• Finding and Understanding the Data
• Cleaning the Data and Feature Engineering
• Tuning and Evaluating
• Using the Model and Presenting Results
The Machine Learning Process
Codecademy
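The four steps above can be walked through end to end on a toy problem. This is a minimal sketch in plain Python, assuming a made-up dataset and a deliberately simple "model" (a single threshold on one feature); a real project would typically use a library and a separate validation set for tuning.

```python
# Hypothetical toy data: (hours_studied, hours_slept, passed); None marks a missing value.
RAW = [
    (1.0, 8.0, 0), (2.0, 7.0, 0), (3.0, None, 0), (4.0, 6.0, 0),
    (5.0, 7.0, 1), (6.0, 8.0, 1), (7.0, 5.0, 1), (8.0, 7.0, 1),
    (2.5, 6.0, 0), (6.5, 6.0, 1),
]

# 1. Finding and understanding the data: count rows with missing values.
n_missing = sum(1 for row in RAW if None in row)

# 2. Cleaning the data and feature engineering: drop incomplete rows and
#    keep a single feature (hours studied) alongside the label.
clean = [(studied, label) for studied, slept, label in RAW if slept is not None]
train, test = clean[:7], clean[7:]

def accuracy(threshold, rows):
    """Fraction of rows where 'studied >= threshold' matches the label."""
    hits = sum(1 for x, y in rows if int(x >= threshold) == y)
    return hits / len(rows)

# 3. Tuning and evaluating: choose the threshold on the training rows,
#    then measure accuracy on the held-out rows.
candidates = [2.0, 3.5, 4.5, 6.0]
best = max(candidates, key=lambda t: accuracy(t, train))
test_accuracy = accuracy(best, test)

# 4. Using the model and presenting results.
def predict(hours_studied):
    return int(hours_studied >= best)

print(f"missing rows: {n_missing}, threshold: {best}, test accuracy: {test_accuracy}")
```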