AI Engineering Book Review
I was lucky enough to get a copy of AI Engineering by Chip Huyen from my local library and I've plowed through most if it in the last few weeks since it's overdue with dozens of people hungrily waiting on my copy.
Setting aside her stellar resume it quickly becomes clear from the content alone that you are getting a unique firehose of useful information from an expert in an emerging field.
Understanding foundation models provides a pretty intense overview, and I was proud of myself for feeling like I was keeping up with all the content here outside of the math. Guess all these years of pretending to understand AI research papers I'm reading is finally starting to pay off.
The two chapters on evaluation really made me start to respect the work of a really good AI integration team. Having played at doing a little of this type of work myself and seeing how dramatically app behavior changed between versions of the same model I realize my tiny little list of evals coupled with my own vibes check were not going to get me very far.
Prompt engineering covered a lot of area I already knew well as a longtime power user of AI chat. The section on Defensive Prompt Engineering was particularly strong. The content was succinct enough to feel like you got a good grasp without bogging down in all the interesting details of different types of attacks.
RAG really filled in a lot of gaps for me. My personal projects have always stopped short of this level of sophistication so its nice to learn what it would look like and the effort involved to store and retrieve relevant data in a vector database rather than doing naive string parsing in some text docs.
Finetuning started to make my lack of fundamentals in building traditional models start to show. The author did a very good job explaining why I might want to go about finetuning a model, but i didn't leave this chapter with the same confidence that I could dive right in and make this happen, not by any fault of the author, I just need to go finish reading Applied Machine Learning and AI for Engineers and then revisit this section.
Dataset Engineering is yet another subject I took for granted as being relatively easy and I'm now realizing how complex it is after reading this book. We've moved well past the volume of data being the limitation to the quality of the data and how well its annotated making the difference. Collecting, storing, labeling are all projects in their own right, and if you fail to collect something you won't get another chance to see that data. Add data synthesis and you can easily burn months of engineering time getting this part right. Really hits home that the decision to finetune (or being forced to finetune) can really change your AI engineering project's trajectory.
Inference optimization is not a problem I've ever had to deal with, but similarly to the above sections its fascinating to learn how complex this area can be. In particular reading about Medusa and multi head parallel decoding made me smile. Most engineers understand LLMs as "just a next-token predictor" but that oversimplification may become dated quickly. An interesting example among many of how complexity increases over time and everything is a moving target in LLM development.
The architecture chapter scratches the surface of all the things you might deploy around these models to support them, routers, gateways and caches should all sound familiar to anyone who has done API-driven microservices development. I really like the ever-growing diagram to visualize how each added layer might add to your overall footprint and complexity. The remainder of this chapter gave a really good breakdown of how gathering user feedback is different for language models compared to other products.
The bottom line is that any team that is actively integrating LLMs should immediately buy a stack of these books and hand them out. This book made it clear to me that there is a lot of very real intense software engineering work needed to support a production deployment. The types and scope of work needed to build software systems that incorporate these models is daunting and a constantly moving target. Those who implement the techniques the author presents are going to be the ones who win in the next few years as VCs check in on all their recent AI investments to see if any revenue is actually being generated not just re-distributed to cloud and AI vendors.
I would hope that everyone who has blithely written off someone's hard work as a thin layer on top of ChatGPT will read this and change their tune, but digesting 400+ pages might be too much effort for social media armchair AI experts.
* Footnote: Don't ignore the footnotes! I normally skip footnotes but there is so much useful data and explanations buried in the footnotes here.

