Curiosity should be driver. Learning and retention is just a side effect. Thats how you make it fun 🙂
100%, I've got nothing to add to that statement!
Love this post man!!!
Thanks, man! There's so much noise today around ML and too few people who educate and point newcomers in the right direction!
Great list. As a newcomer to this field, I, too, started out watching YouTube videos and shorts for "trendy ML concepts", but they got overwhelming. I found courses (I've used YouTube playlists and Udemy) to be more structured, and books are the ultimate source of detailed explanations because they build up chapter by chapter.
I'd like to mention these books that gave me a good introduction to these new AI/ML topics.
- The Illustrated Transformer (Jay Alammar)
- LLM Engineer's Handbook (Paul Iusztin)
- AI Engineering (Chip Huyen)
I'm currently going through blogs for more hands-on projects; they cover conversational topics and help me keep up to date with new trends. It's good to see that a lot of content is now focusing on production-ready applications and best practices. Looking forward to reading your post on content creators for AI/ML content.
I’m taking the IBM course on Coursera, what do you think about it?
Nice post! That’s what I am trying to do at Machine Learning at Scale :)!
Thanks, man!
Happy to hear that, I've read your latest article LLM Serving (4), great series btw - pure tech, no hype! 🔥
Happy to hear you like it :)
You cleared the confusion. Awesome post.
Thanks for the feedback! Glad to hear that 🔥
I really liked that Chip Huyen book.
100% me too! that book helped me learn and understand the fundamentals of ML Systems, really great read!
Solid recommendations
Thanks man, glad you liked it! 🔥
Nice stack mate
Thanks man 🔥
"First things first, press the snooze button on the hype distractions.
Mute everything that’s not practical:
Ignore the “influencers” who buzz your feed with tag lines such as “KILLER FEATURE”, “GAME CHANGING”, “HUGE.”
Ignore the memes
Ignore every fresh model and paper release; you won’t keep up.
Ignore the “use these 10000x tools to master AI” messages.
The brain is not wired up to keep track of constant impulses and frequent information overload."
I was just contemplating this morning writing a Substack post on how I'm dealing with AI, and that paragraph covers one of the points I was going to make.
So maybe I'll just quote you! (With a link to your piece, of course. In fact, I'm going to restack it.)
My favorite BS line from the AI "influencers" is this one: "[Some tool/new LLM] JUST STUNNED THE INDUSTRY!" I don't know how many times I read that line during 2024.
Happy to hear we share the same thought!
I honestly grew tired of the typical BS AI-influencer stuff and got bored of unfollowing people who promote it. The thing I dislike the most is how everyone has become an "AI expert": people with zero experience in the field, selling courses and bootcamps in the order of $1,000+, teaching you how to use all these "NEW TOOLS". Total Dunning-Kruger effect at play there :))
This post is an eye-opener for me. As you said, I was totally distracted by the hype, and in the end I felt confused and stuck. This article is really helpful for me. Thanks, Alex! You are the person I really needed in my ML journey. I would be very pleased to get your mentorship, Alex; I really need it. Once again, thank you.
Happy to hear that, Sai!
Feel free to message me privately if you'd like a specific topic for me to dive into.
The notion that "learning is not supposed to be fun" resonates with me a lot. Although, IMO, learning feels like hell because you are literally breaking some neural pathways and creating new ones, and the body does resist the change (inertia). But once you understand what you are learning, the hell you went through does seem worth it.
The book list is also good, along with the ratings.
A suggestion:
If you have read "AI Engineering" by Chip Huyen, then I think it should also be in here. This is one book that kept me grounded during the AI hype. Since your blog is about cutting through the hype, I think it should be added.
What a comment, I love it!
Yes, thank you for your suggestion, completely with you on that!
Chip's book is on my list; I didn't add it here because it's geared towards building with foundation models, and in that regard, I'll write another article with a rich set of resources that focuses specifically on that subfield.
Thank you for your feedback, Nayak.
I have read half of it (AI Engineering); I have not yet completed it. Is there something about foundation models too? I was under the impression that it only covered best practices for building AI apps, the questions you should ask before building them, and some simple example apps to prove the point. Just like her previous book (Designing Machine Learning Systems), which was less code-intensive and took a more best-practices-focused approach.
The thing is, I read the initial few chapters of a book in one sitting, i.e., 5-6 hours with frequent 30-40 minute breaks in between. This tells me that the book speaks to me and I do want to know more about what is written. As I forget some of the things I read in previous sittings, I re-read or skim through what I've already read. I think I have to revisit it again.
Yes, Chip's AI Engineering book covers foundation models and how to build applications on top of FMs.
Since FMs are inherently different from traditional deep learning models, practices for building AI apps also differ, as you now have security, human-in-the-loop, vector databases, LLMOps, and other components.
I would love it if you could share your insights on the learning schedule and the AI Engineering book in an article; I think other people would be interested in that too. What do you think?
Thanks for the encouragement! It really means a lot!
I will surely do that. I think I will share the learning schedule first, then the notes about the book as a follow-up.
Till then, I'm going through the book review by the YT channel "The AI Engineer":
https://youtu.be/U8tC0l06cFQ?feature=shared
And once done, you will be the first to know. Can we connect on LinkedIn? I have sent you a request.
I really align with the points mentioned in the videos below:
How to take notes:
https://youtu.be/ATmJb3bH2E0?feature=shared
How to read and absorb what you are reading:
https://youtu.be/uiNB-6SuqVA?feature=shared
Very useful 👍
Thanks, Adrian; I'm glad you've enjoyed it. Don't forget to vote in the poll :)) - I'd appreciate your feedback. 🙏
Hidden Markov Models can be used to generate a language, that is, to list elements from a family of strings. For example, if you have an HMM that models a set of sequences, you can generate members of this family by listing sequences that would fall into the group of sequences being modelled.
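To make that generative reading concrete, here is a minimal sketch of sampling strings from a toy two-state HMM; the start, transition, and emission probabilities are made-up illustrative numbers, not anything from the discussion above.

```python
import numpy as np

rng = np.random.default_rng(0)

start = np.array([0.6, 0.4])            # P(initial hidden state)
trans = np.array([[0.7, 0.3],           # P(next state | current state)
                  [0.2, 0.8]])
emit = np.array([[0.9, 0.1],            # P(symbol | state) over symbols "a", "b"
                 [0.2, 0.8]])
symbols = ["a", "b"]

def sample_string(length: int) -> str:
    """Walk the hidden chain and emit one observable symbol per step."""
    state = rng.choice(2, p=start)
    out = []
    for _ in range(length):
        out.append(symbols[rng.choice(2, p=emit[state])])
        state = rng.choice(2, p=trans[state])
    return "".join(out)

# Each call lists another member of the family of strings the HMM models.
print([sample_string(5) for _ in range(3)])
```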
Neural Networks take an input from a high-dimensional space and simply map it to a lower-dimensional space (the way a Neural Network maps this input depends on its training, its topology, and other factors). For example, you might take a 64-bit image of a number and map it to a true/false value that describes whether the number is a 1 or a 0.
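As an illustration of that mapping, here is a tiny untrained feedforward network in the same spirit: it squashes a 64-value input down to a single probability. The random weights are placeholders assumed purely for the sketch, not a trained model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 8x8 binary "image" of a digit, flattened to 64 values.
x = rng.integers(0, 2, size=64).astype(float)

W1, b1 = rng.normal(size=(16, 64)), np.zeros(16)  # 64 inputs -> 16 hidden units
W2, b2 = rng.normal(size=(1, 16)), np.zeros(1)    # 16 hidden -> 1 output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

h = sigmoid(W1 @ x + b1)   # hidden representation (the "hidden" part of the net)
p = sigmoid(W2 @ h + b2)   # high-dimensional input mapped to one probability
print(p > 0.5)             # the true/false value described above
```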
Whilst both methods are able to (or can at least try to) discriminate whether an item is a member of a class or not, Neural Networks cannot generate a language as described above.
There are alternatives to Hidden Markov Models available; for example, you might be able to use a more general Bayesian network, a different topology, or a Stochastic Context-Free Grammar (SCFG) if you believe the problem lies in the HMM's lack of power to model your problem, that is, if you need an algorithm that can discriminate between more complex hypotheses and/or describe the behaviour of much more complex data.
What is hidden and what is observed: The thing that is hidden in a hidden Markov model is the same as the thing that is hidden in a discrete mixture model, so for clarity, forget about the hidden state's dynamics and stick with a finite mixture model as an example. The 'state' in this model is the identity of the component that caused each observation. In this class of model such causes are never observed, so 'hidden cause' is translated statistically into the claim that the observed data have marginal dependencies which are removed when the source component is known. And the source components are estimated to be whatever makes this statistical relationship true. The thing that is hidden in a feedforward multilayer neural network with sigmoid middle units is the states of those units, not the outputs, which are the target of inference. When the output of the network is a classification, i.e., a probability distribution over possible output categories, these hidden units' values define a space within which categories are separable. The trick in learning such a model is to make a hidden space (by adjusting the mapping out of the input units) within which the problem is linear. Consequently, non-linear decision boundaries are possible for the system as a whole.
Generative versus discriminative: The mixture model (and HMM) is a model of the data-generating process, sometimes called a likelihood or 'forward model'. When coupled with some assumptions about the prior probabilities of each state, you can infer a distribution over possible values of the hidden state using Bayes' theorem (a generative approach). Note that, while called a 'prior', both the prior and the parameters in the likelihood are usually learned from data. In contrast to the mixture model (and HMM), the neural network learns a posterior distribution over the output categories directly (a discriminative approach). This is possible because the output values were observed during estimation. And since they were observed, it is not necessary to construct a posterior distribution from a prior and a specific model of the likelihood such as a mixture. The posterior is learnt directly from data, which is more efficient and less model-dependent.
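A small sketch of that generative route, using a two-component 1-D Gaussian mixture as the forward model; the priors, means, and standard deviations below are invented numbers for illustration only.

```python
import numpy as np
from scipy.stats import norm

prior = np.array([0.5, 0.5])    # P(component k), normally learned from data
means = np.array([-1.0, 2.0])   # per-component likelihood parameters
stds = np.array([1.0, 1.0])

def posterior(x: float) -> np.ndarray:
    """Bayes' theorem: P(k | x) is proportional to P(k) * P(x | k)."""
    lik = norm.pdf(x, loc=means, scale=stds)
    unnorm = prior * lik
    return unnorm / unnorm.sum()

# Distribution over the hidden component that could have generated x = 0.3.
print(posterior(0.3))
```

A discriminative model would instead fit P(k | x) directly from labelled pairs, skipping the prior and the likelihood model entirely.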
Mix and match: To make things more confusing, these approaches can be mixed together, e.g., when the mixture model (or HMM) state is sometimes actually observed. When that is true, and in some other circumstances not relevant here, it is possible to train discriminatively in an otherwise generative model. Similarly, it is possible to replace the mixture-model mapping of an HMM with a more flexible forward model, e.g., a neural network.
Alex, as part of my curriculum I have to write a review paper, and I took a topic in the RAG domain. While reading papers on frameworks, there were terms like processing modules, pipelines, and modular design; these words seem new to me, and I am using AI to understand them. For every paragraph there is so much to deep-dive into and understand. I don't think I am ready for a review paper on RAG; I am new, and I know I have a long learning journey to cover yet. I am looking forward to your guidance.