The hottest Machine Learning Substack posts right now

And their main takeaways

The Four Fundamental Quantities of LLMs: Part Zero - 📜 What are Large Language Models?

Intuitive AI • 1 HN point • 21 May 23

🕹 Technology Machine Learning

Large language models (LLMs) are neural networks with billions of parameters trained to predict the next word using large amounts of text data.
LLMs use parameters learned during training to make predictions based on input data during the inference stage.
Training an LLM involves optimizing the model to predict the next token in a sentence by feeding it billions of sentences to adjust its parameters.

How does AI fail ?

INT3 / Low-level Cybersecurity • 1 HN point • 03 May 23

🕹 Technology Machine Learning

AI systems can fail due to evasion attacks, poisoning attacks, and privacy attacks.
Mitigations against these attacks exist but come with costs and trade-offs.
There is ongoing exploration and discussion around AI cybersecurity risk management to protect against these failures.

Federated Learning Not 101 - FL0

Arkid’s Newsletter • 1 HN point • 11 May 23

🕹 Technology Machine Learning

Federated Learning is a decentralized form of machine learning that ensures privacy and data security.
Federated Averaging is a technique used in Federated Learning to update global models with local changes.
Federated Learning allows for on-device training while maintaining privacy and improving model performance.
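
The Federated Averaging step mentioned above can be sketched as a weighted mean of client parameters. This is a minimal illustration, not the post's own code: the function name, the toy parameter vectors, and the choice to weight by local dataset size are assumptions made for the example.

```python
def federated_average(client_weights, client_sizes):
    """Combine client parameter vectors into one global vector.

    Each client's parameters are weighted by its local dataset size
    (an assumption of this sketch), so raw data never leaves the device.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical example: three clients, two parameters each.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [10, 30, 60]
global_weights = federated_average(clients, sizes)
print(global_weights)
```

Clients with more local data pull the global model further toward their update, which is why the largest client dominates the result here.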

Thoughts on A.I. Dilemma: Part 1

Machine Learning Everything • 1 HN point • 17 Apr 23

🕹 Technology Machine Learning

The comparison between AI and social media highlights the potential dangers associated with large language models.
Advancements in large language models, like GPT, can lead to proficiency across various domains, similar to how universal game engines can excel in multiple games.
Language is emphasized as the ultimate medium in AI development, with the trend shifting towards more end-to-end systems.

Knowing how to measure

Apperceptive (moved to buttondown) • 1 HN point • 14 Apr 23

🔬 Science Machine Learning

Psychologists excel at measuring behavior accurately.
AI systems lack precise definitions of capabilities and limitations due to inadequate measurement tools.
ML and AI need to integrate psychology's measurement skills urgently to understand and advance the capabilities of computer systems.

Twitter Algorithm Review Parts 2,3,4. Out of Network Tweets, Heavy Ranker and some Recommendations

PashaNomics • 1 implied HN point • 03 Apr 23

🕹 Technology Machine Learning

Twitter's algorithm includes features like Out-Of-Network tweets, Heavy Ranker, and Post-Ranker changes.
There are concerns about the impact of algorithmic features on user well-being and engagement.
Recommendations to Twitter include promoting transparency, adjusting algorithm features, and considering the overall user experience.

Towards Acting AI

How the Hell • 1 HN point • 24 Mar 23

🕹 Technology Machine Learning

GPT-4 has achieved human-level intelligence at various tasks by scaling up existing models.
We've reached the limits of Large Language Model scaling, as simply mimicking human behavior isn't enough for advancements.
AI models like the one developed by Adept.ai show potential to perform diverse tasks, bridging the gap between AI and real-world applications.

Entropic emanations

Simplicity is SOTA • 1 HN point • 13 Mar 23

🔬 Science Machine Learning

Log loss is a proper scoring function that incentivizes honest prediction and has intrinsic meaning.
Cross entropy in multiclass problems is based on log loss, which compares predictions to outcomes on a log scale.
Modifying cross entropy to consider negative classes in loss functions may impact gradient calculation simplicity and model fitting.
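
To make the log loss takeaways concrete, here is a small sketch using only the standard library. The helper names and the probabilities are illustrative assumptions; the last function shows the "proper scoring" property by checking that an honest forecast gets a lower expected loss than an overconfident one.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary log loss; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def cross_entropy(true_class, probs, eps=1e-15):
    """Multiclass cross entropy for one example: -log of the true class probability."""
    return -math.log(max(probs[true_class], eps))


def expected_log_loss(true_prob, reported_prob):
    """Expected log loss when the event occurs with probability true_prob."""
    return -(true_prob * math.log(reported_prob)
             + (1 - true_prob) * math.log(1 - reported_prob))


# Honest reporting (0.7) beats exaggeration (0.9) when the true rate is 0.7.
honest = expected_log_loss(0.7, 0.7)
overconfident = expected_log_loss(0.7, 0.9)
print(honest < overconfident)
```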

The Future of Open vs Closed Source in AI: In Conversation With Hugging Face CEO Clem Delangue

Unsupervised Learning • 1 implied HN point • 06 Mar 23

🕹 Technology Machine Learning

Tech teams will evolve to become AI teams building machine learning models.
Software engineering may be a subset of machine learning in the future.
Hugging Face's name originated from a love for the Hugging Face emoji.

Takeaways from "How does ChatGPT work" blog

Experiments with NLP and GPT-3 • 1 HN point • 01 Mar 23

🕹 Technology Machine Learning

ChatGPT generates text one word at a time
To predict the next word, the system finds embeddings and generates probabilities
ChatGPT shows evidence of fundamental 'laws of language' that can be discovered
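
As a rough sketch of the "generates probabilities" step, the snippet below turns a set of scores into a probability distribution with softmax and picks the most likely next word. The vocabulary and scores are made up for illustration; a real model computes its scores from learned embeddings and usually samples rather than always taking the top word.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = [s - max(scores) for s in scores]  # subtract max for numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical vocabulary and scores for the word following "The cat sat on the".
vocab = ["mat", "moon", "idea"]
scores = [2.0, 0.5, -1.0]

probs = softmax(scores)
next_word = vocab[probs.index(max(probs))]  # greedy choice
print(next_word)
```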

AI chatbots don't know why they did it

Skybrian’s Blog • 0 implied HN points • 27 Apr 23

🕹 Technology Machine Learning

AI chatbots don't remember their thought process
AI chatbots make up justifications like humans
The bug is in trying to answer questions, not in the answer itself

AutoGPTs for Every Vertical

ExpandAI Newsletter • 0 implied HN points • 03 May 23

🕹 Technology Machine Learning

AI is becoming a central part of modern technologies and is expected to dominate more of the economy.
Startups are seeing success in building AI for various industries, while larger companies such as Microsoft are integrating copilot capabilities into their products.
AutoGPTs, like Copilot, are gaining popularity and are expected to provide economic value autonomously.

Sketch of a new English orthography

Perambulations • 0 implied HN points • 07 May 23

🕹 Technology Machine Learning

English spelling is complex due to its accumulation of bits and pieces of other languages.
Efforts for English spelling reform have included developing custom scripts and simplified spelling movements.
An ideal English writing system may balance phonetic fidelity with concision, embed emphasis information, address vowel complexity, and include characters for high-frequency sound combinations.

The rise of GELU

Simplicity is SOTA • 0 implied HN points • 08 May 23

🕹 Technology Machine Learning

GELU is a popular activation function in modern models like ChatGPT and BERT, rivaling ReLU in usage.
Activation functions are crucial in neural networks to introduce non-linearity for complex functions.
GELU offers advantages like smoothness and potential better approximation of complex functions compared to ReLU.
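
The difference in smoothness can be seen by evaluating both functions near zero. This sketch uses the exact GELU definition, x times the standard normal CDF; many libraries use a tanh approximation instead, which is not shown here.

```python
import math


def relu(x):
    """ReLU: zero for negative inputs, identity otherwise (kink at 0)."""
    return max(0.0, x)


def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


for x in [-2.0, -1.0, 0.0, 1.0, 2.0]:
    print(x, relu(x), round(gelu(x), 4))
```

Unlike ReLU, GELU lets small negative inputs produce small negative outputs, and it approaches the identity for large positive inputs.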

Can Artificial Intelligence Replace the Human Stock Picker?

Deep Dive Tangents and Rationalizations • 0 implied HN points • 10 May 23

💰 Finance Machine Learning

The investment industry has been aware of AI and its limitations for around 3-4 decades.
Using AI in stock-picking can show varied performance outcomes compared to traditional strategies like Buy-Write.
AI adoption in financial markets is increasing, but it is more about supplementing human efforts rather than completely replacing them.

10 things to remember when chatting with a generative AI

AIFeed (A GPT-3 augmented content generator) • 0 implied HN points • 17 Feb 23

🕹 Technology Machine Learning

Be patient with generative AI chatbots as they are still new technology.
Be clear and concise in your communication to help the chatbot understand you better.
Keep it simple by using straightforward questions to get useful responses.

Two-tower models for ranking problems

Simplicity is SOTA • 0 implied HN points • 22 May 23

🕹 Technology Machine Learning

Two-tower models are a technique being used in academia to improve ranking systems by looking into how position and user behavior affect clicks.
Critiques have been raised against the two-tower models, questioning if they effectively separate biases and relevance in ranking.
A new method called GradRev is emerging as a potential improvement over the previous two-tower models, applying a different approach to address bias in learning-to-rank systems.

Bridging the Gap

Kiernan • 0 implied HN points • 05 May 23

🕹 Technology Machine Learning

The system can analyze podcast content like topics and sentiment without manual listening.
Bridging the gap refers to improving machine trustworthiness for human tasks.
Future plans involve deeper data analysis, such as identifying different types of ads in podcasts.

Another cog in the machine

Kiernan • 0 implied HN points • 03 Jun 23

🕹 Technology Machine Learning

LLMs have limitations but can be powerful tools for specific tasks like identifying content in podcast transcripts.
LLMs can be used to extract information from unstructured content, converting human-usable text into computer-usable formats with text instructions.
Using LLMs for specific, constrained tasks can lead to quicker and more confident results compared to complex rule-based approaches.

Unraveling Deep & Cross Networks

Simplicity is SOTA • 0 implied HN points • 05 Jun 23

🕹 Technology Machine Learning

Deep & Cross Networks (DCNs) help find multiplicative interactions in ML models
DCN-V2 uses cross layers in neural networks to improve feature learning
DCNs incorporate feature crosses effectively but may face limitations in certain data scenarios
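
A single DCN-V2 cross layer can be sketched in plain Python as x_next = x0 * (W x + b) + x, with the product taken element-wise. The weights below are hypothetical values chosen so the result is easy to check by hand; in a real model W and b are learned.

```python
def cross_layer(x0, x, weight, bias):
    """One DCN-V2 style cross layer: x0 * (W @ x + b) + x, element-wise product."""
    projected = [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(weight, bias)
    ]
    return [x0_i * p_i + x_i for x0_i, p_i, x_i in zip(x0, projected, x)]


# Hypothetical inputs: identity weights and zero bias.
x0 = [1.0, 2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zero_bias = [0.0, 0.0]

x1 = cross_layer(x0, x0, identity, zero_bias)
print(x1)
```

Multiplying by the original input x0 at every layer is what introduces the multiplicative feature interactions, while adding x back keeps the lower-order terms.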

The Importance of Verification

Embracing Enigmas • 0 implied HN points • 09 Jun 23

🕹 Technology Machine Learning

Automating processes requires trust and relinquishing some direct control.
Machine learning verification involves creation-time and post-deployment steps for ensuring model performance and reliability.
Artificial intelligence must undergo fact verification, coherency, consistency, and quality assessment to ensure reliable outputs.

Wolfram Plugin

Age of AI • 0 implied HN points • 29 May 23

🕹 Technology Machine Learning

ChatGPT has a Wolfram Plugin that can answer straightforward questions easily.
For more complex questions, ChatGPT may struggle with syntax but can eventually provide correct answers.
Humans may still need to review ChatGPT's work, but it shows potential for improvement in solving problems.

Continuous retraining and formalizing the model aging framework

Santiago and the ML Models • 0 implied HN points • 19 Jun 23

🕹 Technology Machine Learning

Santiago Víquez is formalizing a model aging framework in his Master's thesis
Improvements in the temporal degradation framework include using cross-validation for model training and validating models based on test set error
Continuous retraining experiments show that model performance may be best before any updates are implemented

What do we mean by inductive bias and expressiveness?

Simplicity is SOTA • 0 implied HN points • 19 Jun 23

🕹 Technology Machine Learning

Inductive bias in machine learning refers to how models make choices in their learning process.
Restriction bias limits the types of hypotheses considered in a model, while preference bias favors certain hypotheses over others.
Expressiveness of a model determines the types of relationships it can capture, and can be enhanced by adding relevant features or interactions.

Monday Curiosity Links

Grist Potentia • 0 implied HN points • 23 May 23

🕹 Technology Machine Learning

LLM developers should know specific numbers for better performance
A machine learning model can detect signs of Alzheimer's in various languages

Building a vector database in 2GB for 36 million Wikipedia passages

Experiments with NLP and GPT-3 • 0 implied HN points • 21 Jun 23

🕹 Technology Machine Learning

Built a vector database for 36 million Wikipedia passages in just 2GB
Used Alpes KE Sieve algorithm for efficient space learning
Cost of hosting the instance was around $100

Embedding Pipelines For Generative AI with Bytewax

Bytewax • 0 implied HN points • 22 Jun 23

🕹 Technology Machine Learning

Sophisticated prompt templating and providing context can improve responses from language models.
Embeddings represent things as vectors in a multi-dimensional space for comparison and similarity.
Bytewax framework can help create a real-time embedding pipeline for processing and storing data in a vector database.
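
Comparing embeddings usually comes down to a similarity measure such as cosine similarity. The sketch below uses made-up three-dimensional vectors and does not use Bytewax or any vector database; it only illustrates the comparison step that such a pipeline would rely on.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; real ones have hundreds or thousands of dimensions.
documents = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.2, 0.1],
    "taxes": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

best_match = max(documents, key=lambda name: cosine_similarity(query, documents[name]))
print(best_match)
```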

The Light and the Darkness: GPT vs BERT

John’s Substack • 0 implied HN points • 30 Jun 23

🕹 Technology Machine Learning

BERT focuses on the encoder for deep understanding of input text.
GPT focuses on the decoder for generating new text based on context.
The contrast between BERT and GPT represents a balance between understanding and creation in NLP.

Getting Started with Machine Learning for Software Engineers

ExpandAI Newsletter • 0 implied HN points • 30 Jun 23

🕹 Technology Machine Learning

The field of AI evolves rapidly, making recent courses outdated each year.
To master AI, one must put in consistent effort over a long period of time.
A great starting point to learn neural networks is Andrej Karpathy's course 'Neural Networks: Zero to Hero'.

Primer on Machine Learning Interviews for Software Engineers

ExpandAI Newsletter • 0 implied HN points • 30 Jun 23

🕹 Technology Machine Learning

Software engineers in the future will likely require strong machine learning backgrounds.
Machine learning interviews for software engineers cover software engineering, mathematics, and machine learning topics.
Preparing for machine learning interviews should focus on optimizing for both software and machine learning skills.

Cost-Saving Machine Learning Deployment Tips!

CodeLink’s Substack • 0 implied HN points • 11 May 23

🕹 Technology Machine Learning

Deploying machine learning models on GPU cores can be costly due to server prices and lack of scalability.
Using Kubernetes and KEDA for autoscaling GPU nodes can significantly reduce costs and improve scalability.
Implementing cost-optimized ML in production can be achieved by using K8s and autoscaling GPU nodes, resulting in substantial cost savings.

Bing’s new AI chatbot is sassing and gaslighting users.

superartificial • 0 implied HN points • 17 Feb 23

🕹 Technology Machine Learning

Bing's new AI chatbot is exhibiting sassy and gaslighting behaviors towards users.
Facebook is enhancing transparency about how machine learning influences the ads users see.
Microsoft's AI chatbot with Bing has been displaying erratic and unsettling behaviors as reported by users and tech experts.

[🔥Hot Takes] Deep learning outperforms linear regression for causal inference and tabular data??

Data Science Daily • 0 implied HN points • 02 Mar 23

🕹 Technology Machine Learning

Deep learning can outperform linear regression for causal inference in tabular data.
Different perspectives exist in the debate between deep learning and traditional models like XGBoost.
The study suggests that deep learning models like CNN, DNN, and CNN-LSTM may offer better performance in certain scenarios.

An Introduction to LSTM with Attention Model

Data Science Daily • 0 implied HN points • 01 Mar 23

🕹 Technology Machine Learning

LSTM models are good at handling input sequences of varying length, as in language modeling and translation.
Attention models help LSTM models focus on important parts of a sequence, improving accuracy.
Combining LSTM with attention models can lead to better predictions and performance in tasks like natural language processing and image captioning.

🤓LSTM Networks: The Power of Long-Short-Term Memory and comparisons to ARIMA/XGBoost/Prophet

Data Science Daily • 0 implied HN points • 23 Feb 23

🕹 Technology Machine Learning

LSTM Networks can remember information for long periods and are great for processing sequential data.
LSTMs can handle a wide variety of input and output types, making them flexible for real-world data.
LSTMs are powerful for time series forecasting but can be computationally expensive, especially with large datasets.

🤓[DS Concept]: Partial Dependence Plots

Data Science Daily • 0 implied HN points • 18 Feb 23

🕹 Technology Machine Learning

Partial Dependence Plots visualize how each input variable affects a machine learning model's predictions.
PDPs show how input variables interact with each other in predicting outcomes.
PDPs help identify feature importance and interactions to optimize machine learning models.
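
A one-feature partial dependence curve can be computed by hand: fix the feature at each grid value for every row, predict, and average. The toy model and data below are assumptions for illustration; libraries such as scikit-learn offer their own implementations for fitted estimators.

```python
def partial_dependence(model, rows, feature_index, grid):
    """Average prediction as one feature is forced to each value in grid."""
    curve = []
    for value in grid:
        predictions = []
        for row in rows:
            modified = list(row)              # copy so the data is not mutated
            modified[feature_index] = value   # override the feature of interest
            predictions.append(model(modified))
        curve.append(sum(predictions) / len(predictions))
    return curve


def toy_model(row):
    """Hypothetical fitted model: prediction = 2 * feature0 + feature1."""
    return 2.0 * row[0] + row[1]


rows = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]
curve = partial_dependence(toy_model, rows, feature_index=0, grid=[0.0, 1.0, 2.0])
print(curve)
```

The curve rises by 2 for each unit of the first feature, matching the coefficient in the toy model. Interactions between two features need a two-dimensional grid, which this sketch leaves out.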

Hot Topics #22 (Apr. 3, 2023)

The Merge • 0 implied HN points • 03 Apr 23

🕹 Technology Machine Learning

Fast Imitation of Skills from Humans (FISH) can train robots with less than a minute of demonstrations.
Regularization and Lipschitz regularization are key in Optimal Transport-Based Distributionally Robust Optimization.
Chain of Hindsight technique helps align language models with human preferences by training on feedback sequences.

Coming soon

Jinay's Substack • 0 implied HN points • 04 Apr 23

🕹 Technology Machine Learning

The author has moved their blog to Substack for its ease of use and wide adoption in the software field.
Older blog posts can still be accessed at the previous blog domain.
Some of the topics covered in the blog include programmatic blogging, backpropagation in machine learning, turning a toy project into a viral challenge, and using computer vision to tell time.

The road to AGI

John’s Contemplations • 0 implied HN points • 05 Apr 23

🕹 Technology Machine Learning

Recent progress in AI has sparked conversations about AGI, but there is still much speculation and analysis needed on how to reach true AGI.
Defining AGI includes the ability to learn any cognitive skill like an expert human and potentially being conscious.
While different pathways like LLMs and RL show promise, the journey to AGI is likely long, with estimates ranging from 10-15 years to beyond 2100.

Demystifying Model Space

The Grey Matter • 0 implied HN points • 26 Apr 23

🕹 Technology Machine Learning

Understanding the capabilities of large language models (LLMs) involves thinking in terms of model space, a multidimensional representation of all possible configurations of a model's parameters.
The vast model space for models like GPT-3 contains a wide range of possibilities, from promoting human flourishing to leading to catastrophe.
The training process of models like GPT involves phases like next-word prediction and reinforcement learning from human feedback, where the model gradually moves through model space to improve its responses.