The hottest Deep Learning Substack posts right now

And their main takeaways

RLHF progress: Scaling DPO to 70B, DPO vs PPO update, Tülu 2, Zephyr-β, meaningful evaluation, data contamination

Democratizing Automation • 213 implied HN points • 22 Nov 23

🕹 Technology Deep Learning

Reinforcement learning from human feedback (RLHF) is still poorly understood and sparsely documented.
Scaling DPO to 70B parameters showed strong performance by directly integrating the data and using lower learning rates.
DPO and PPO take different approaches; DPO shows potential for improving chat evaluation scores, and the Tülu and Zephyr models trained with it have happy users.
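
For readers who have not seen DPO written out, the per-pair objective fits in a few lines. This is a minimal sketch of the standard DPO loss, not the Tülu 2 or Zephyr training code; the function name `dpo_loss`, the example log-probabilities, and `beta=0.1` are illustrative assumptions.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair, given summed token log-probabilities."""
    # How much more likely each response is under the policy than under the reference
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), written as log(1 + exp(-logits))
    return math.log1p(math.exp(-logits))

# Hypothetical log-probabilities: the policy has shifted toward the chosen response
print(dpo_loss(-8.0, -12.0, -10.0, -12.0))
```

When the policy equals the reference model the loss is log 2, and it falls as the policy favors the chosen response more than the reference does. No separate reward model or sampling loop is needed, which is the main practical difference from PPO.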

For the Love of PyTorch

Sector 6 | The Newsletter of AIM • 39 implied HN points • 04 Sep 23

🕹 Technology Deep Learning

PyTorch is a key player in the development of AI, particularly large language models (LLMs). Its flexibility makes it great for deep learning experiments.
The framework supports GPUs really well and allows for easy updates to computation graphs during programming.
In 2022, PyTorch had a significant edge on platforms like Hugging Face, with 92% of models being PyTorch-exclusive compared to just 8% for TensorFlow.

Deep learning for single-cell sequencing: a microscope to see the diversity of cells

The Gradient • 144 implied HN points • 13 Jan 24

🔬 Science Deep Learning

Deep Learning is a key enabler for advancing single-cell sequencing technologies.
Single-cell sequencing allows us to explore the diversity of individual cells at a detailed level.
Tools using Deep Learning techniques are pivotal in analyzing single-cell RNA sequencing data.

The Good A.I. We’re Not Talking About

The Digital Anthropologist • 19 implied HN points • 04 Jan 24

🕹 Technology Deep Learning

Artificial Intelligence (AI) is not just about Generative AI (GAI) like ChatGPT. There are various other proven AI tools like Machine Learning (ML), Deep Learning, Natural Language Processing (NLP), and Expert Systems being successfully used in industries such as healthcare, manufacturing, and more.
AI tools have been around for decades and have shown significant positive impacts on society. Despite the hype around GAI, it remains a small part of the broader AI landscape.
Beyond the flashy headlines, many AI applications are working behind the scenes in specialized industries, quietly making a positive difference. While GAI is getting attention, the real-world impact of other AI tools continues to be substantial.

The Sequence Knowledge #463: Wrapping Up our Series About Knowledge Distillation: Pros and Cons

TheSequence • 35 implied HN points • 07 Jan 25

🕹 Technology Deep Learning

Knowledge distillation is a method where a smaller model learns from a larger, more complex model. This helps make the smaller model efficient while retaining essential features.
The series covered different techniques and challenges in knowledge distillation, highlighting its importance in machine learning and AI development. Understanding these can help when deciding if this approach is suitable for your projects.
It's useful to be aware of both the benefits and drawbacks of knowledge distillation. This helps in figuring out the best way to implement it in real-world applications.
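
As a rough illustration of how the smaller model "learns from" the larger one, the sketch below computes a soft-target distillation loss. It assumes the common temperature-scaled KL formulation; the series may cover other variants, and the function names, logits, and temperature are made up for the example.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl

teacher = [4.0, 1.0, 0.5]  # hypothetical teacher logits for one input
print(distillation_loss([4.0, 1.0, 0.5], teacher))  # student matches teacher
print(distillation_loss([1.0, 1.0, 1.0], teacher))  # uniform student is penalized
```

A higher temperature flattens the teacher's distribution, so the student also sees how the teacher ranks the wrong classes. In practice this term is usually mixed with an ordinary loss on the true labels.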

The Bias vs Variance Tradeoff [Math Mondays]

Technology Made Simple • 39 implied HN points • 06 Dec 22

🕹 Technology Deep Learning

Understanding the Bias-Variance Tradeoff is crucial in Data Science and Machine Learning.
Bias in a Machine Learning model refers to systematic prediction error caused by overly simple assumptions, while Variance measures how much predictions spread when the model is trained on different samples of data.
High Bias can lead to underfitting, where the model doesn't grasp the data pattern fully, while High Variance can result in overfitting, where the model learns noise in the data.
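
The tradeoff can be seen in a small simulation. The sketch below is an illustration built for this summary, not code from the post: it repeatedly draws noisy training sets from an assumed target function and compares a constant predictor (high bias) with a nearest-neighbor predictor (high variance) at a single test point.

```python
import random
import statistics

def true_f(x):
    return x * x  # assumed ground-truth function for the simulation

def sample_training_set(rng, n=20, noise=0.3):
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, true_f(x) + rng.gauss(0.0, noise)) for x in xs]

def predict_mean(train, x0):
    # Ignores x0 entirely: too simple, so it underfits
    return statistics.fmean([y for _, y in train])

def predict_nearest(train, x0):
    # Copies the label of the closest training point, noise included
    return min(train, key=lambda pair: abs(pair[0] - x0))[1]

def bias_and_variance(predict, x0=0.9, trials=2000, seed=0):
    rng = random.Random(seed)
    preds = [predict(sample_training_set(rng), x0) for _ in range(trials)]
    bias_sq = (statistics.fmean(preds) - true_f(x0)) ** 2
    variance = statistics.pvariance(preds)
    return bias_sq, variance

print("mean model    (bias^2, variance):", bias_and_variance(predict_mean))
print("nearest model (bias^2, variance):", bias_and_variance(predict_nearest))
```

The constant model shows the larger squared bias and the nearest-neighbor model the larger variance, matching the underfitting and overfitting cases described above.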

S. Somasegar on the Present and Future of Generative AI

TheSequence • 161 implied HN points • 15 Mar 23

🕹 Technology Deep Learning

Generative AI is a subsegment of intelligent applications with potential in enterprise and consumer use cases.
Developer tools will be reimagined with foundation models, enhancing productivity and code quality.
New capabilities in generative AI models include the use of 'agents' for natural language interpretation and actions.

AI Stores

Yuxi’s Substack • 19 implied HN points • 15 Feb 23

🕹 Technology Deep Learning

We are entering the era of AI Stores.
An AI Store provides general AI capabilities like drafting emails, drawing, and suggesting software code.
Contributing to or benefiting from AI Stores can range from being a customer to fine-tuning models based on resources.

Everything you need to know about geometric deep learning

Three Data Point Thursday • 19 implied HN points • 22 Jun 23

🕹 Technology Deep Learning

You should be using alternative data.
Avoid using geometric deep learning unless you're a data entrepreneur.
If you're already building something, flatten your data instead of using GDL.

Fourteen Podcast Episodes about AI

Mike Talks AI • 19 implied HN points • 27 Apr 23

🕹 Technology Deep Learning

Recommended AI podcast episodes cover topics like AI safety, self-driving cars, and deep learning.
Podcasts like 'My First Million' and 'a16z' offer insights on AI in entrepreneurship and the creator economy.
A diverse range of podcasts explores AI applications in fields like image recognition, sensor data analysis, and deep learning models.

The Birth of Baby Llama

Sector 6 | The Newsletter of AIM • 19 implied HN points • 25 Jul 23

🕹 Technology Deep Learning

Andrej Karpathy worked on a fun project to create a smaller version of the Llama 2 model called Baby Llama. It's designed to run on a single computer.
The Baby Llama can load and use the models released by Meta, making it more accessible for users.
Karpathy shared that the performance is promising, with potential for faster processing speeds on a cloud setup.

Bayesian Thinking for Software Engineering [Math Mondays]

Technology Made Simple • 59 implied HN points • 03 May 22

🕹 Technology Deep Learning

Bayes Theorem allows us to update beliefs based on evidence, crucial for software developers making decisions.
Bayesian Thinking is implicit in many decisions we make, and recognizing its importance can prevent fallacies.
Learning Bayesian Thinking involves understanding the intuition behind the math, using resources like StatQuest and 3Blue1Brown.
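
A small worked example makes the belief update concrete. The scenario (a flaky test and a possibly buggy commit) and all probabilities are hypothetical, chosen only to show the arithmetic of Bayes' theorem.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) from P(H), P(E | H), and P(E | not H)."""
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence

# H: the commit contains a bug. E: the test fails.
prior_bug = 0.10          # assumed base rate of buggy commits
fail_if_bug = 0.90        # the test catches most real bugs
fail_if_no_bug = 0.20     # the test is flaky

after_one_failure = posterior(prior_bug, fail_if_bug, fail_if_no_bug)
after_two_failures = posterior(after_one_failure, fail_if_bug, fail_if_no_bug)
print(after_one_failure, after_two_failures)
```

One failure moves the belief from 0.10 to about 0.33, and a second independent failure to about 0.69. Ignoring the base rate and jumping straight to "the test failed, so there is a bug" is the kind of fallacy the post warns about.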

Two years later, deep learning is still faced with the same fundamental challenges

Marcus on AI • 61 HN points • 10 Mar 24

🕹 Technology Deep Learning

Two years on, deep learning still faces the same fundamental challenges: progress has been made, but not in all areas.
Obstacles to general intelligence persist despite advancements like GPT-4 and Sora.
Scaling in deep learning hasn't solved issues like genuine comprehension; there's acknowledgment of a potential plateau in AI innovation.

NVIDIA and the battle for the future of Generative AI

Sector 6 | The Newsletter of AIM • 39 implied HN points • 07 Nov 22

🕹 Technology Deep Learning

NVIDIA released a new AI model called eDiffi that creates better images than existing tools like DALL·E 2 and Stable Diffusion. This shows they are making strides in generative AI technology.
In 2022, there was a prediction about NVIDIA launching text-to-image models, and eDiffi is finally their answer to that anticipation. It signifies a new chapter for creative AI tools.
NVIDIA's previous tool, GauGAN, allowed sketches to become realistic landscapes, and now they are advancing to text-based inputs with eDiffi. This represents a move toward more versatile and user-friendly AI innovations.

Newsletter 21: To keepdims or not to keepdims!

Decoding Coding • 1 HN point • 19 Jul 24

🕹 Technology Deep Learning

Understanding the 'keepdims' parameter in tensor operations is important for getting correct results in PyTorch. If you set 'keepdims' to True, the dimensions are preserved, which helps with broadcasting correctly.
When summing tensors, if 'keepdims' is False, it can lead to incorrect calculations because the tensor's shape changes. This can result in dividing values incorrectly, leading to unexpected outputs.
It's crucial to be careful with tensor shapes and broadcasting rules in machine learning models. Even a small oversight can cause models to produce wrong predictions, so always double-check these details.
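
The failure mode is easy to reproduce. The sketch below uses NumPy rather than PyTorch to stay lightweight (an assumption made for this example); the broadcasting rules are the same, and in PyTorch the argument to `torch.sum` is spelled `keepdim`. The matrix values are arbitrary.

```python
import numpy as np

x = np.array([[1.0, 1.0, 2.0],
              [2.0, 2.0, 4.0],
              [3.0, 3.0, 6.0]])

row_sums_kept = x.sum(axis=1, keepdims=True)  # shape (3, 1)
row_sums_flat = x.sum(axis=1)                 # shape (3,)

# Intended: divide every row by its own sum
normalized = x / row_sums_kept

# Silent bug: (3,) broadcasts along the last axis,
# so column j is divided by the sum of row j
wrong = x / row_sums_flat

print(normalized.sum(axis=1))  # every row sums to 1
print(wrong.sum(axis=1))       # rows no longer sum to 1
```

Because the matrix is square, the second division raises no shape error, which is why this kind of bug can go unnoticed.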

Learning from the biggest Machine Learning Research YouTuber [Storytime Saturdays]

Technology Made Simple • 19 implied HN points • 04 Dec 22

🕹 Technology Deep Learning

Creating content for a niche audience should focus on solving personal problems rather than trying to be the 'best'.
In the realm of Machine Learning, it's more effective to cover what personally interests you rather than what is considered standard or important by others.
Understanding and dealing with biases in large ML models like Stable Diffusion and GPT-3 is crucial in harnessing their capabilities while mitigating potential pitfalls.

Why Deep Learning is everywhere [Math Mondays]

Technology Made Simple • 19 implied HN points • 25 Oct 22

🕹 Technology Deep Learning

Deep Learning is a subset of Machine Learning that uses Neural Networks with many layers; the non-linearity these layers introduce is crucial to its success.
Deep Networks work well because they can approximate any continuous function by combining non-linear functions, allowing them to tackle complex problems.
The widespread use of Deep Learning is driven by its trendiness and efficiency, appealing to many due to its ability to provide results without extensive data analysis or training.
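
Why the non-linearity matters can be shown with XOR, a classic case that no purely linear (affine) model can fit. The sketch below is an illustration added here, with hand-picked weights rather than trained ones.

```python
def relu(z):
    return max(0.0, z)

def xor_net(x1, x2):
    """Two hidden ReLU units with hand-picked weights computing XOR on 0/1 inputs."""
    h1 = relu(x1 + x2)
    h2 = relu(x1 + x2 - 1.0)
    return h1 - 2.0 * h2

def affine(x1, x2, a, b, c):
    """Any model without a non-linearity collapses to this form."""
    return a * x1 + b * x2 + c

inputs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
print([xor_net(x1, x2) for x1, x2 in inputs])

# For every affine model, f(0,0) + f(1,1) equals f(0,1) + f(1,0).
# XOR needs 0 + 0 on the left and 1 + 1 on the right, so no choice of a, b, c works.
a, b, c = 0.3, -1.7, 0.5  # arbitrary coefficients
print(affine(0, 0, a, b, c) + affine(1, 1, a, b, c),
      affine(0, 1, a, b, c) + affine(1, 0, a, b, c))
```

Stacking more linear layers does not help, since a composition of affine maps is still affine; the ReLU is what breaks the constraint.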

Gradient Flow #42: Data Quality; Oscilloscope for Deep Learning; Feature Stores

Gradient Flow • 39 implied HN points • 26 Aug 21

🕹 Technology Deep Learning

Data quality is crucial in machine learning and new tools like feature stores are emerging to improve data management.
Experts are working on auditing machine learning models to address issues like discrimination and bias.
Large deep learning models such as Jurassic-1 Jumbo with 178B parameters are being made available for developers.

Data Science Weekly - Issue 458

Data Science Weekly Newsletter • 19 implied HN points • 01 Sep 22

🕹 Technology Deep Learning

Machine learning best practices are shared in a guide from Google, helping those with some knowledge to improve their skills.
There's skepticism about deep learning promises, as experts continue to predict big changes that haven't happened yet.
AI is being used creatively, like generating art from Bible stories, which showcases the potential of technology in different fields.

Data Science Weekly - Issue 453

Data Science Weekly Newsletter • 19 implied HN points • 28 Jul 22

🕹 Technology Deep Learning

Creating a focused GitHub repository can help others in the field, like those working with satellite images and deep learning.
There are unique Python packages available that can enhance your data workflow, making tasks easier and more efficient.
Understanding the technology behind AI and how to use it effectively is crucial for building better models and systems.

Are all hard technical problems AI problems now?

Perceptions • 35 implied HN points • 17 Feb 23

🕹 Technology Deep Learning

AI has made significant progress in solving complex technical problems in various domains.
Many technical problems can be boiled down to optimization/minimization challenges, which AI is well-equipped to handle.
The advancement in AI technology raises questions about the future of work, centralization, and the impact on different professions.

Software²

The Gradient • 29 implied HN points • 22 Apr 23

🕹 Technology Deep Learning

AI research is shifting focus from 'learning from data' to 'learning what data to learn from'.
State-of-the-art deep learning models are becoming data sponges capable of modeling immense amounts of data.
Future AI research trends may emphasize data collection and generation to improve model performance.

How the field of "AI" got like this

Apperceptive (moved to buttondown) • 20 implied HN points • 02 Nov 23

🕹 Technology Deep Learning

The field of AI can be hostile to individuals who are not white men, which hinders progress and innovation.
The history of AI showcases past failures and the subsequent shift towards more practical, engineering-focused approaches like machine learning.
Success in the AI field is heavily reliant on performance advancements on known benchmarks, emphasizing practical engineering solutions.

Forward vs Backward Differentiation

The Palindrome • 2 implied HN points • 16 Jul 25

🕹 Technology Deep Learning

Neural networks can be trained effectively because of vectorization, which allows many calculations to happen at the same time.
Gradient descent helps in optimizing complex functions by finding the best path for improvement in training.
Backpropagation is a method that calculates the necessary adjustments for minimizing error, making the training process more efficient.
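
The distinction in the post's title can be sketched on a two-input function. This is a minimal illustration, not the post's code: forward mode carries a derivative alongside each value and needs one pass per input, while reverse mode (the idea behind backpropagation) gets all partial derivatives from one backward sweep. The function f(x, y) = x*y + sin(x) is an arbitrary choice.

```python
import math

class Dual:
    """A value plus its derivative with respect to one chosen input (forward mode)."""
    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv
    def __add__(self, other):
        return Dual(self.value + other.value, self.deriv + other.deriv)
    def __mul__(self, other):
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

def dual_sin(d):
    return Dual(math.sin(d.value), math.cos(d.value) * d.deriv)

def f_forward(x, y):
    return x * y + dual_sin(x)  # f(x, y) = x*y + sin(x)

def grad_forward(x, y):
    # One forward pass per input: seed that input's derivative with 1
    df_dx = f_forward(Dual(x, 1.0), Dual(y, 0.0)).deriv
    df_dy = f_forward(Dual(x, 0.0), Dual(y, 1.0)).deriv
    return df_dx, df_dy

def grad_reverse(x, y):
    # Forward pass computes out = a + b with a = x*y and b = sin(x).
    # Backward pass: start from d(out)/d(out) = 1 and apply the chain rule.
    d_out = 1.0
    d_a = d_out                        # out = a + b
    d_b = d_out
    d_x = d_a * y + d_b * math.cos(x)  # x feeds both a and b
    d_y = d_a * x
    return d_x, d_y

print(grad_forward(2.0, 3.0))
print(grad_reverse(2.0, 3.0))
```

Both give the same gradient. With millions of parameters and a single scalar loss, the one-sweep property of reverse mode is what makes training practical.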

Faster LLMs, Safer Chains of Thought, and Image Tokenization Reinvented

ppdispatch • 2 implied HN points • 18 Jul 25

🕹 Technology Deep Learning

There's a new book that helps people understand deep learning in a clear way. It covers important topics like neural networks and how they work.
A new technique called Chain-of-Thought Monitorability may help keep AI safe by watching how AI reasons with language. But it’s still seen as a bit weak and needs more work.
Researchers found that recent improvements in AI reasoning might not be genuine. They suggest that better ways to check AI's performance are needed to ensure it really understands and isn't just memorizing data.

AI is a Shoggoth

GOOD INTERNET • 23 implied HN points • 06 Mar 23

🕹 Technology Deep Learning

AI in the digital world is becoming increasingly strange and difficult to understand, akin to Lovecraftian horror.
The ability of AI to connect disparate information can lead to collective delusions and conspiracy theories like QAnon.
AI's evolving features, like voice cloning and reinforcement learning, show similarities to Lovecraft's description of Shoggoths.

The Belamy | Metaverse, Robot Tax & AI-Ready Youth

Sector 6 | The Newsletter of AIM • 19 implied HN points • 12 Sep 21

🕹 Technology Deep Learning

The metaverse is a growing digital space where people can interact and create, much like the real world. It's becoming an important part of our online experience.
There is a discussion about a 'robot tax', which would be a tax on companies that use robots to replace human jobs. This could help address job loss due to automation.
Preparing young people for an AI-driven future is crucial. Education systems are starting to include skills related to AI and technology to better equip the next generation.

How does batching work on modern GPUs?

Artificial Fintelligence • 8 implied HN points • 01 Mar 24

🕹 Technology Deep Learning

Batching is a key optimization for modern deep learning systems, allowing for processing multiple inputs simultaneously without significant time overhead.
Modern GPUs run operations concurrently, leading to no additional time needed as batch sizes increase up to a certain threshold.
For convolutional networks, the advantage of batching is reduced compared to other models due to the reuse of weights across multiple instances.
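
The "free up to a threshold" behavior can be approximated with a simple roofline-style estimate. This is a back-of-the-envelope sketch, not the post's analysis: it models a single square linear layer, ignores activation traffic, and uses hypothetical hardware numbers (300 TFLOP/s peak compute, 1.5 TB/s memory bandwidth).

```python
def layer_time_seconds(batch, d_model, peak_flops, mem_bandwidth, bytes_per_param=2):
    """Estimated time for one d_model x d_model linear layer at a given batch size."""
    flops = 2 * batch * d_model * d_model               # one multiply and one add per weight per input
    weight_bytes = bytes_per_param * d_model * d_model  # weights are read once per batch
    compute_time = flops / peak_flops
    memory_time = weight_bytes / mem_bandwidth
    # Whichever resource is the bottleneck determines the time
    return max(compute_time, memory_time)

PEAK_FLOPS = 300e12      # hypothetical accelerator
MEM_BANDWIDTH = 1.5e12   # hypothetical accelerator

for batch in (1, 128, 400):
    print(batch, layer_time_seconds(batch, 4096, PEAK_FLOPS, MEM_BANDWIDTH))
```

Under these assumptions the layer is limited by reading its weights until the batch reaches about 200, so batches of 1 and 128 take the same estimated time, and only beyond that point does time grow with batch size.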

Deep Learning Is Better Than Linear Regression

As Clay Awakens • 2 HN points • 19 Mar 23

🕹 Technology Deep Learning

Linear regression is a reliable, stable, and simple technique with a long history of successful applications.
Deep learning, especially non-linear regression, has shown significant advancements over the past decade and can outperform linear regression in many real-world tasks.
Deep learning models have the ability to automatically learn and discover complex features, making them advantageous over manually engineered features in linear regression.

LLaMA: LLMs for Everyone!

Deep (Learning) Focus • 2 HN points • 10 Apr 23

🕹 Technology Deep Learning

LLaMA provides a collection of open-source LLMs with different sizes for better efficiency.
LLaMA models perform surprisingly well, even outperforming larger models in some cases.
LLaMA challenges the trend of needing massive models by showing the effectiveness of smaller, extensively pre-trained LLMs.

Data Science Weekly - Issue 372

Data Science Weekly Newsletter • 19 implied HN points • 07 Jan 21

🕹 Technology Deep Learning

DALL·E is a powerful AI that creates images from text descriptions, showcasing its ability to combine different ideas and concepts in creative ways.
Machine learning is making significant strides in healthcare, but it also comes with risks that need careful consideration to ensure patient safety.
Transformers have revolutionized natural language processing and are now being applied to various tasks in computer vision, improving how we manage data.

Data Science Weekly - Issue 368

Data Science Weekly Newsletter • 19 implied HN points • 10 Dec 20

🕹 Technology Deep Learning

Machine learning needs systematic approaches to create strong systems for real-world use. This means looking beyond just algorithms to see the bigger picture.
Deep neural networks are powerful, but understanding how they work can be tricky. Tools like network dissection can help us figure out what these networks are really doing.
Feature stores are becoming important for machine learning. They allow teams to share and manage data better for creating and deploying models quickly.

Why I'm allergic to AI hype

More is Different • 7 implied HN points • 06 Jan 24

🕹 Technology Deep Learning

Data science jobs may not be as glamorous as they seem, often involving mundane tasks and not much intellectual excitement.
Efforts to create AGI have faced challenges, with ambitious projects like Mindfire encountering skepticism and practical difficulties.
AI in healthcare, such as for radiology, has seen startups struggle and face issues like lack of affordability, deployment challenges, and unpredictability in performance.

Deep Learning Platform, TinyML, Privacy ↔ Contact Tracing

Gradient Flow • 19 implied HN points • 07 May 20

🕹 Technology Deep Learning

Deep learning models are being implemented in tiny devices with tools like TinyML for ultra-low-power systems.
Distributed training for deep learning models is made simpler and cheaper with libraries like RaySGD.
Technology like facial recognition for contact tracing can also raise concerns about privacy and mass surveillance.

Post-Transformers - Hyena Hierarchy

Why Now • 8 implied HN points • 04 Sep 23

🕹 Technology Deep Learning

Hyena clans have a linear dominance hierarchy with a one-to-one chain of command.
Transformer-based LLMs face scaling limitations in their attention mechanisms.
Hyena proposes a sub-quadratic alternative to attention via long convolutions and data-controlled gating.
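
The sub-quadratic part rests on a standard trick: a convolution as long as the sequence can be computed with FFTs in O(L log L) instead of O(L^2). The sketch below shows only that trick plus an elementwise gate; it is not the Hyena operator itself, and `gate` is a stand-in for what would be a learned projection of the input.

```python
import numpy as np

def causal_conv_direct(u, h):
    """O(L^2) causal convolution: y[t] = sum over s <= t of h[s] * u[t - s]."""
    L = len(u)
    y = np.zeros(L)
    for t in range(L):
        for s in range(t + 1):
            y[t] += h[s] * u[t - s]
    return y

def causal_conv_fft(u, h):
    """Same result in O(L log L): zero-pad to 2L, multiply spectra, keep the first L outputs."""
    L = len(u)
    n = 2 * L  # padding avoids circular wrap-around
    y = np.fft.irfft(np.fft.rfft(u, n=n) * np.fft.rfft(h, n=n), n=n)
    return y[:L]

def gated_long_conv(u, h, gate):
    """Elementwise gate applied to the long-convolution output."""
    return gate * causal_conv_fft(u, h)

rng = np.random.default_rng(0)
u = rng.standard_normal(64)   # input sequence
h = rng.standard_normal(64)   # filter as long as the sequence
print(np.allclose(causal_conv_direct(u, h), causal_conv_fft(u, h)))
```

Attention compares every position with every other position, which costs O(L^2); the FFT route mixes information across the whole sequence without forming that L-by-L matrix.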

Data Science Weekly - Issue 349

Data Science Weekly Newsletter • 19 implied HN points • 30 Jul 20

🕹 Technology Deep Learning

Deep learning has important ideas that have been around for a while. If you're new to it, learning these basics can really help you understand current research.
GPT-3 is creating a lot of buzz, and it's important to think critically about the hype. Understanding the difference between hype and reality helps us navigate new technologies better.
Evaluating machine learning models is similar to testing software. New methods can help us better assess how well these models work, which is key to making them reliable.

Data Science Weekly - Issue 348

Data Science Weekly Newsletter • 19 implied HN points • 23 Jul 20

🕹 Technology Deep Learning

Deep Learning papers can be confusing for beginners, but there's a roadmap to help you choose where to start. It's a good way to navigate through the vast amount of research out there.
Machine Learning is creating a lot of value for businesses, and it's important to understand how this value can be captured. Different companies are finding unique ways to apply ML for their needs.
New techniques in AI, like using neural networks for soundscapes, are not just tech innovations but can also help protect the environment. It shows how technology can contribute to nature conservation.

Data Science Weekly - Issue 332

Data Science Weekly Newsletter • 19 implied HN points • 02 Apr 20

🕹 Technology Deep Learning

Agent57 is a new deep learning agent that can beat human scores in all Atari games. It's a big step forward in how we measure AI performance.
During the COVID-19 crisis, it's important to approach data honestly and with curiosity. This helps individuals responsibly discuss topics outside their expertise.
ACM is offering free access to their digital library to support research and learning during the pandemic. This allows more people to access valuable computing resources.

Data Science Weekly - Issue 323

Data Science Weekly Newsletter • 19 implied HN points • 30 Jan 20

🕹 Technology Deep Learning

Data cleaning is a big part of a data scientist's job. Many great ideas can get stuck because people can't access or use the right data.
Choosing the right settings, called hyperparameters, greatly impacts a machine learning project's success. There are smarter ways to find these settings than just guessing.
Learning is easier when it's structured step by step. Using a curriculum helps models learn complex tasks bit by bit, just like how people learn.
