Strategy for Mastering AI in 2024


So, your aim is to dive into AI studies? Yet you are unsure about how or where to begin?

In 2020, I shared a list of the Top 20 free Data Science, ML, and AI MOOCs available online. However, I have come to realize that enrolling in numerous courses is not the most effective approach.

To break free from the cycle of endless tutorials and truly gain expertise, you need to engage in practical applications, develop algorithms from scratch, implement research papers, and undertake engaging AI projects to solve real-world issues.

This piece aims to lay out a free curriculum that follows this philosophy. I am actively working through some of these courses, so feel free to get in touch on Twitter or LinkedIn if you wish to learn alongside me!

Moreover, please drop a comment if you identify any omissions!

Before delving into specifics, there are certain insights to consider regarding the curriculum and some guidance on the learning process.

Starting from the Top-down

This curriculum adopts a top-down methodology — initially focus on coding, theory later.

I prefer learning out of necessity. Whenever there is a problem to address, a solution to devise, or a prototype to construct, I will extensively search for the essential information, comprehend it, and subsequently act upon it.

For instance, my ambition is to evolve into an AI engineer well-versed in LLMs at a foundational level. This entails the ability to code transformers from the ground up and fine-tune LLMs on GPUs. Currently, I lack this proficiency due to gaps in my knowledge, which I am determined to bridge.

This curriculum primarily focuses on NLP. If you are interested in other AI specializations like computer vision or reinforcement learning, drop a comment below or message me directly on Twitter or LinkedIn, and I will recommend some resources.

Before I inundate you with links, here are two things I wish someone had told me before I started learning.

Learning in Public

The realm of knowledge is vast and continuous learning is imperative, especially in the field of AI where groundbreaking research and concepts emerge on a weekly basis.

The gravest mistake one can make is to confine learning to private spheres. By doing so, you limit opportunities for growth. Mere completion of tasks lacks significance. What holds greater value is how you assimilate the information, transform it into knowledge for dissemination, and conceive innovative solutions and ideas based on that knowledge.

Hence, it is vital to learn in public.

This can entail:

  • drafting blogs and tutorials
  • participating in hackathons and collaborating with peers
  • asking and addressing queries in Discord communities
  • embarking on projects of personal interest
  • sharing newfound discoveries on Twitter

Moreover, when it comes to Twitter,

Employing Twitter as a Tool

If you follow the right people and use it well, Twitter is the most valuable social platform available today.

Whom to follow? Check out this curated AI list by Suhail.

How to use Twitter? Refer to Near’s guide on Effective Twitter Practices.

Reach out to individuals on Twitter. Be genuine, concise, and articulate your requirements clearly. This guide on Crafting Cold Emails by Sriram Krishnan is also applicable to Direct Messages.

How to write tweets? Refer to Elements of a Tweet by Jason, the developer of Instructor, who gained 14k followers within months.

If you are perusing this article, do follow me on Twitter!

Contact me regarding your ongoing projects! Collaboration on intriguing endeavors always piques my interest.

Let’s now delve into the details.

Foundations in Mathematics

Machine learning heavily relies on three core mathematical pillars: linear algebra, calculus, as well as probability and statistics. Each of these elements plays an essential role in facilitating the optimal performance of algorithms.

  • Linear Algebra: serves as the mathematical framework for data management and manipulation, with matrices and vectors acting as the primary medium for algorithmic interpretation and information processing
  • Calculus: serves as the driving force for optimization in machine learning, enabling algorithms to learn and refine by comprehending gradients and rate variations
  • Probability and Statistics: provide the fundamental principles for decision-making amidst uncertainty, enabling algorithms to forecast outcomes and learn from data through probabilistic and variance models
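
To make the three pillars concrete, here is a minimal sketch (my own illustration, not taken from any of the courses listed) that fits a straight line with gradient descent using only the Python standard library. The dot product is the linear algebra, the gradients are the calculus, and the mean squared error being minimised is a simple statistic of the data. The function names, toy data, learning rate, and step count are all assumptions chosen for the example.

```python
# Minimal sketch: fit y = w*x + b by gradient descent (standard library only).
# All names and hyperparameters here are illustrative assumptions.

def dot(u, v):
    """Dot product of two equal-length vectors (linear algebra)."""
    return sum(a * b for a, b in zip(u, v))


def fit_line(xs, ys, lr=0.05, steps=2000):
    """Minimise the mean squared error with plain gradient descent (calculus)."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Residuals of the current prediction w*x + b against the targets
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE = (1/n) * sum(residual^2)
        grad_w = 2.0 / n * dot(residuals, xs)
        grad_b = 2.0 / n * sum(residuals)
        # Step against the gradient
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

On this noise-free toy data the fitted slope and intercept should land very close to 2 and 1. Try a larger learning rate to watch the updates diverge.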

Weights & Biases offers a phenomenal series on Math for Machine Learning from a developer’s viewpoint (source code).

If you prefer a code-centric approach to Linear Algebra, explore Computational Linear Algebra (video series, source code) shared by the developers of fast.ai.

Follow Introduction to Linear Algebra for Applied Machine Learning with Python in parallel with the course.

If you desire a more conventional approach, check out Imperial College London’s lectures on Linear Algebra & Multivariate Calculus.

View 3Blue1Brown’s Essence of Linear Algebra and Essence of Calculus.

For statistics, watch Statistics Fundamentals by StatQuest.

Additional Resources

Resources

Python

Beginners can start here: Practical Python Programming.

If you are already adept in Python, consider Advanced Python Mastery.

Both are exceptional courses from David Beazley, the author of Python Cookbook.

Next, watch James Powell’s presentations.

Explore Python Design Patterns.

Additional Resources

PyTorch

Watch PyTorch Tutorials by Aladdin Persson

The PyTorch website offers valuable content.

Challenge yourself with some puzzles

Additional Resources

Machine Learning

Refer to the 100-page ML book.

Develop from Scratch

While studying, implement the algorithms from scratch.
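
As an example of what "from scratch" can look like, here is a small sketch of a k-nearest-neighbours classifier written with only the Python standard library. It is my own illustration rather than code from any resource in this list, and the function names, toy points, and labels are made up for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest points."""
    # Indices of the training points, sorted by distance to the query
    by_distance = sorted(
        range(len(train_points)),
        key=lambda i: euclidean(train_points[i], query),
    )
    votes = Counter(train_labels[i] for i in by_distance[:k])
    return votes.most_common(1)[0][0]


# Two well-separated toy clusters (illustrative data)
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.2, 1.4)))
print(knn_predict(points, labels, (8.7, 8.8)))
```

Once a version like this works, compare its predictions with a library implementation on the same data. Chasing down any differences is where most of the learning happens.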

Check out the repositories listed below

For a challenging task, build PyTorch from scratch by following this course.

Participate in Competitions

Apply your knowledge in competitive scenarios.

Engage in Side Projects

Explore Bringing machine learning to production by Vicki Boykis

She also documented her experience building Viberary, a semantic book search engine.

Obtain a dataset and develop a model (e.g., leverage earthaccess for NASA Earth data).

Construct a user interface using streamlit and share it on Twitter.

Deploy the Models

Put the models into production. Track your experiments. Learn to monitor models. Experience data and model drift firsthand.

Explore these valuable resources

Additional Resources

Deep Learning

For those interested in a top-down approach, begin with fast.ai.

Fast.ai

If you enjoyed fast.ai, explore Full Stack Deep Learning.

For a more in-depth and conventional course, consider checking out UNIGE 14×050 — Deep Learning by François Fleuret.

If you need to delve into theory later on, these are excellent resources.

Read The Little Book of Deep Learning on your mobile instead of scrolling through Twitter.

Read these while your neural networks are training.

Engage in more competitions

Implement research papers

Explore labml.ai Annotated PyTorch Paper Implementations

Papers with Code is a valuable resource; check out BERT explained on their platform.

Below are some references for specific areas within Deep Learning

Computer Vision

Many individuals suggest the CS231n: Deep Learning for Computer Vision course. It’s demanding but rewarding if you persevere.

Reinforcement Learning

For RL enthusiasts, these resources are excellent:

NLP

Another standout Stanford course, CS 224N | Natural Language Processing with Deep Learning

Get familiar with Hugging Face: Hugging Face NLP Course

Explore this Super Duper NLP Repo

Informative articles and breakdowns

Additional resources

Large Language Models

Start by watching [1hr Talk] Intro to Large Language Models by Andrej Karpathy.

Then check out Large Language Models in Five Formulas, by Alexander Rush — Cornell Tech

Watch Neural Networks: Zero to Hero by Andrej Karpathy.

It starts by explaining and coding backpropagation from the ground up and ends with building GPT from scratch.
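
To give a flavour of what coding backpropagation from the ground up involves, here is a tiny scalar autograd sketch. It is my own simplified illustration and not the code from the course: the `Value` class, its attribute names, and its support for only addition and multiplication are assumptions made to keep the example short.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        """Apply the chain rule from this node back to every input."""
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)  # parents are appended before children

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x, w, b = Value(3.0), Value(2.0), Value(1.0)
y = w * x + b
y.backward()
print(y.data, w.grad, x.grad, b.grad)
```

Each operation records a small closure that knows its local derivative, and `backward` walks the graph in reverse topological order so every node's gradient is complete before it is passed on to its parents.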

Andrej recently shared a new video → Let’s build the GPT Tokenizer
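
For a rough idea of the kind of algorithm behind GPT-style tokenizers, here is a minimal byte-pair encoding sketch: start from raw UTF-8 bytes and repeatedly replace the most frequent adjacent pair with a new token id. This is my own simplified illustration rather than the code from the video, and the function names and toy string are assumptions.

```python
from collections import Counter


def most_frequent_pair(ids):
    """Return the most common adjacent pair of token ids, or None if there is none."""
    pairs = Counter(zip(ids, ids[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges, starting from raw UTF-8 bytes (ids 0-255)."""
    ids = list(text.encode("utf-8"))
    merges = {}
    for step in range(num_merges):
        pair = most_frequent_pair(ids)
        if pair is None:
            break
        new_id = 256 + step  # new tokens are numbered after the 256 byte values
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges


ids, merges = train_bpe("aaabdaaabac", 3)
print(ids)
print(merges)
```

Production tokenizers add details this sketch leaves out, such as splitting text into chunks before merging and handling special tokens.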

You may also want to explore GPT in 60 Lines of NumPy | Jay Mody during this process.

Free LLM boot camp

A complimentary LLM Bootcamp released by Full Stack Deep Learning.

This boot camp covers prompt engineering, LLMOps, UX for LLMs, and guidance on launching an LLM app within an hour.

After completing this boot camp and feeling eager to build,

Develop with LLMs

Interested in creating applications with LLMs?

Watch Application Development using Large Language Models by Andrew Ng

Read Building LLM applications for production by Chip Huyen

As well as Patterns for Building LLM-based Systems & Products by Eugene Yan

Consult the OpenAI Cookbook for practical guides.

Utilize Vercel AI templates to kickstart your projects.

Join hackathons

lablab.ai hosts fresh AI hackathons weekly. Feel free to reach out if you’re interested in teaming up!

If you wish to delve deeper into the theoretical aspects and grasp the functioning of everything:

Read papers

An exceptional article by Sebastian Raschka on Understanding Large Language Models lists the papers you should read.

He also recently published another article with papers to read in January 2024, covering Mistral models.

Explore his substack called Ahead of AI.

Creating Transformers from the Ground Up

Peruse The Transformer Family Version 2.0 | Lil’Log for a concise summary.

Pick whichever format suits you best and implement a transformer from scratch.

Documentation

Articles

Visuals

You are now equipped to build transformers from scratch. Nonetheless, more awaits.

View these Stanford CS25 — Transformers United videos.

Insightful Blog Posts

Watch Umar Jamil’s videos.

He explains papers in depth and walks through the accompanying code.

Here are additional resources related to LLMs that are by no means exhaustive. Refer to the LLM Syllabus for a detailed overview of LLMs.

Running Open-Source Models

Utilize ollama: Get up and running with Llama 2, Mistral, and other large language models locally

Ollama also recently released Python and JavaScript libraries.

Engaging in Prompt Engineering

Consult Prompt Engineering | Lil’Log

ChatGPT Prompt Engineering for Developers by Isa Fulford (OpenAI) and Andrew Ng

Delve into other brief courses available for free at DeepLearning.ai.

Enhancing LLMs through Fine-Tuning

Refer to the Hugging Face fine-tuning guide.

Helpful guidance: Fine-Tuning — The GenAI Guidebook

Discover axolotl.

Benefit from this informative article: Fine-tune a Mistral-7b model with Direct Preference Optimization | by Maxime Labonne

RAG

An exceptional piece by Anyscale: Building RAG-based LLM Applications for Production

An in-depth analysis of Retrieval Augmented Generation by Aman Chadha

AI Education Resources

Other Curriculums/Listicles to Explore

While my list is not exhaustive, if you desire more resources, here are a few suggestions.

I trust this will aid you on your AI expedition!
