Join 📚 Kevin's Highlights
A batch of the best highlights from what Kevin's read.
Here’s what the entity resolution query looks like:

I’m joining the table with itself on state + zipcode
to reduce the search space and
using string similarity thresholds
for filtering potential duplicates.
In entity resolution methodology this is known as “blocking.”
Fundamental Data Engineering Concepts - Part 2
Ergest Xheblati
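
The query itself is not part of the highlight above, so what follows is only a rough Python sketch of the same idea, not the author's SQL. It groups records into blocks by state + zipcode, compares names only within a block, and keeps pairs whose similarity clears a threshold. The sample records, the field names, the `candidate_duplicates` helper, the choice of `difflib.SequenceMatcher` as the similarity measure, and the 0.85 threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records; field names are assumptions, not from the article.
records = [
    {"id": 1, "name": "Acme Corporation", "state": "NY", "zipcode": "10001"},
    {"id": 2, "name": "Acme Corporatoin", "state": "NY", "zipcode": "10001"},
    {"id": 3, "name": "Globex Inc", "state": "NY", "zipcode": "10001"},
    {"id": 4, "name": "Acme Corporation", "state": "CA", "zipcode": "94105"},
]


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_duplicates(rows, threshold=0.85):
    """Return (id, id, score) for likely duplicates, comparing only within blocks."""
    # Blocking: bucket rows by (state, zipcode) so that records in different
    # buckets are never compared. This is what shrinks the search space.
    blocks = defaultdict(list)
    for row in rows:
        blocks[(row["state"], row["zipcode"])].append(row)

    pairs = []
    for block in blocks.values():
        # Equivalent to a self-join restricted to the blocking key,
        # with each unordered pair visited once.
        for left, right in combinations(block, 2):
            score = similarity(left["name"], right["name"])
            if score >= threshold:  # threshold filter for potential duplicates
                pairs.append((left["id"], right["id"], score))
    return pairs


pairs = candidate_duplicates(records)
for left_id, right_id, score in pairs:
    print(left_id, right_id, f"{score:.2f}")
```

With this sample data, records 1 and 2 are flagged as a candidate pair, while record 4 is never compared with record 1 despite the identical name, because it sits in a different block. That is the trade-off of blocking: far fewer comparisons, at the cost of missing duplicates whose blocking keys disagree.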
You likely know what this means
even if you don’t know it in those words.
The hype cycle, as defined by Gartner, [which tracks it](https://www.gartner.com/en/chat/gartner-hype-cycle),
is that series of cyclical events
that happens around nearly all emerging technologies:
the breakthrough,
the “peak of inflated expectations,”
the disillusionment,
the period of actual serviceable uses of the tech,
and the time when it’s adopted.
That pinnacle is the groan time,
the moment Justin Bieber drops more than $1 million
on an NFT.
The moment Facebook buys Oculus.
The moment the bodega starts taking bitcoin
and you know you’ll never be able to escape this thing,
whatever it is.
This Is the Worst Part of the AI Hype Cycle
Angela Watercutter
it’s important to realize that **ChatGPT and LaMDA aren’t trained to be correct**.
You can train models that are optimized to be correct—but
that’s a different kind of model.
Models like that are being built now;
they tend to be smaller and trained on specialized data sets
(O’Reilly Media has a search engine that has been trained on the 70,000+ items in our learning platform).
And you could integrate those models with GPT-style language models, so that
one group of models supplies the *facts* and
the other supplies the *language*.
Sydney and the Bard
Mike Loukides