Simplicity is SOTA

Simplicity is SOTA focuses on insights into machine learning, software engineering practices, and the dynamics of marketplaces. It critically examines established methodologies, discussing the impact of data sampling, experiment design, model biases, and the effectiveness of various machine learning techniques and programming concepts in improving prediction, optimization, and system design.

Machine Learning Techniques · Software Engineering Practices · Data Sampling and Analysis · Experiment Design and Analysis · Model Biases and Calibration · Programming Concepts and Optimization · Marketplace Dynamics · Position Bias and User Behavior · Activation Functions and Model Architecture · Marketing Effectiveness Measurement

The hottest Substack posts of Simplicity is SOTA

And their main takeaways
122 HN points 10 Apr 23
  1. The standard use of p < 0.05 as a threshold in experiment analysis may not be as useful as commonly believed.
  2. The choice of p < 0.05 as a significance level in experiments is a default that was set nearly a century ago.
  3. In the tech industry, where the goal is to find real product improvements, the risk of false negatives should also be carefully considered, not just false positives.
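
To make the false-negative point concrete, here is a minimal simulation (all numbers invented): a real but modest lift is often missed at p < 0.05, and less often at a looser threshold.

```python
# A minimal simulation (hypothetical numbers): how often does a real but
# modest improvement fail to reach significance at different alpha levels?
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, true_lift, n_sims = 2000, 0.05, 2000          # per-arm sample size, real effect, repetitions
misses = {0.05: 0, 0.20: 0}                       # false negatives at each threshold

for _ in range(n_sims):
    control = rng.normal(1.00, 1.0, n)
    treatment = rng.normal(1.00 + true_lift, 1.0, n)
    _, p = stats.ttest_ind(treatment, control)
    for alpha in misses:
        misses[alpha] += p >= alpha               # real improvement, but not "significant"

for alpha, count in misses.items():
    print(f"alpha={alpha:.2f}: missed the real lift in {count / n_sims:.0%} of experiments")
```
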
2 HN points 27 Mar 23
  1. The concept of 'embedding' in machine learning has evolved and become widely used, replacing terms like vectors and representations.
  2. Embeddings can be applied to various types of data, come from different layers in a neural network, and are not always about reducing dimensions.
  3. Defining 'embedding' has become challenging due to its widespread use, but the essence is about learned transformations that make data more useful.
2 HN points 24 Oct 22
  1. Balanced datasets provide more information for any fixed sample size.
  2. Balancing samples through oversampling or undersampling may lead to overfitting and decreased model performance.
  3. Balancing datasets may not necessarily improve class separation and can negatively impact calibration and log loss.
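
A quick sketch of the calibration point, using synthetic data rather than anything from the post: undersampling the majority class can leave ranking quality roughly intact while distorting predicted probabilities and log loss.

```python
# Illustrative only: undersampling the majority class can hurt calibration and
# log loss even if ranking quality (AUC) barely changes.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=50_000, n_features=10, weights=[0.97, 0.03], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Model 1: fit on the data as-is.
raw = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Model 2: fit on a 50/50 undersampled version of the training set.
rng = np.random.default_rng(0)
pos = np.flatnonzero(y_tr == 1)
neg = rng.choice(np.flatnonzero(y_tr == 0), size=len(pos), replace=False)
idx = np.concatenate([pos, neg])
balanced = LogisticRegression(max_iter=1000).fit(X_tr[idx], y_tr[idx])

for name, model in [("as-is", raw), ("undersampled", balanced)]:
    p = model.predict_proba(X_te)[:, 1]
    print(name, "AUC", round(roc_auc_score(y_te, p), 3),
          "log loss", round(log_loss(y_te, p), 3),
          "mean p", round(p.mean(), 3), "base rate", round(y_te.mean(), 3))
```
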
1 HN point 13 Mar 23
  1. Log loss is a proper scoring function that incentivizes honest prediction and has intrinsic meaning.
  2. Cross entropy in multiclass problems is based on log loss, which compares predictions to outcomes on a log scale.
  3. Modifying cross entropy so that the loss also penalizes predictions for the negative classes can complicate the gradient calculations and model fitting.
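
A small numeric check of the "proper scoring" claim (my own toy example, not the post's): the expected log loss is minimized by reporting the true probability.

```python
# Log loss is a proper scoring rule: in expectation it is minimized by honesty.
import numpy as np

true_p = 0.3                                   # true probability of the positive class
candidates = np.linspace(0.05, 0.95, 19)       # probabilities a model might report

def expected_log_loss(q, p=true_p):
    # Expected loss of predicting q when the outcome is 1 with probability p.
    return -(p * np.log(q) + (1 - p) * np.log(1 - q))

losses = [expected_log_loss(q) for q in candidates]
best = candidates[int(np.argmin(losses))]
print(f"expected log loss is minimized near q = {best:.2f} (true p = {true_p})")
```
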
0 implied HN points 05 Dec 22
  1. The order in which search results are presented matters a lot because it affects user clicks.
  2. Models that ignore position when predicting outcomes can struggle to generalize when the ranking criteria change.
  3. Including position as a feature in models can improve accuracy in predicting user behavior, especially in scenarios where the model's functional form aligns with the behavior being modeled.
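
A rough sketch of the idea with simulated clicks (the setup and numbers are invented, not the post's): train with position as a feature, then hold position fixed at serving time so candidates are compared on relevance alone.

```python
# Synthetic example: include position as a feature during training, then score
# every candidate as if it were shown in position 1.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 100_000
relevance = rng.normal(size=n)                       # hypothetical relevance signal
position = rng.integers(1, 11, size=n)               # rank 1..10 at logging time
examine = 1.0 / position                             # position-based examination probability
click_prob = examine / (1 + np.exp(-relevance))      # examination x attractiveness
clicks = rng.random(n) < click_prob

X = np.column_stack([relevance, np.log(position)])
model = LogisticRegression().fit(X, clicks)

# At serving time, hold the position feature fixed (log of rank 1, i.e. 0).
candidates = np.array([[-1.0, 0.0], [0.0, 0.0], [2.0, 0.0]])
print(model.predict_proba(candidates)[:, 1])         # increases with relevance
```
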
0 implied HN points 07 Nov 22
  1. Having more data may not always improve model performance, especially when data recency is important.
  2. In some domains where data is accumulated over time, the most recent data may be the most representative for predicting the near future.
  3. Using simulations to evaluate different sample durations can help determine the ideal sample size for model training.
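
An illustrative simulation of the sample-duration point (the drift here is invented): when the underlying relationship changes over time, a longer training window is not automatically better.

```python
# When the true relationship drifts, older data can hurt more than it helps.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
weeks, n_per_week = 52, 500
slope = np.linspace(1.0, 3.0, weeks)                     # the true coefficient drifts over the year
X = rng.normal(size=(weeks, n_per_week, 1))
y = slope[:, None] * X[..., 0] + rng.normal(0, 0.5, size=(weeks, n_per_week))

test_X, test_y = X[-1].reshape(-1, 1), y[-1].ravel()     # evaluate on the most recent week
for window in (4, 12, 26, 51):                           # train on the trailing `window` weeks before it
    tr_X = X[-1 - window:-1].reshape(-1, 1)
    tr_y = y[-1 - window:-1].ravel()
    model = LinearRegression().fit(tr_X, tr_y)
    print(f"{window:2d}-week window: test MSE = {mean_squared_error(test_y, model.predict(test_X)):.3f}")
```
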
0 implied HN points 19 Dec 22
  1. Using randomization combined with a simple approach can help measure position bias in ranking systems effectively.
  2. The position-based propensity model (PBM) offers a straightforward and flexible way to model user behavior in ranking systems.
  3. Randomizing search results, even through slight variations like RandPair, can provide clean, sensible estimates of position-bias factors at relatively low cost.
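
A simplified sketch of the randomization idea, using a synthetic position-based model rather than the post's exact RandPair estimator: under randomized rankings, per-position click-through rates recover the examination propensities up to a constant.

```python
# Synthetic PBM: with randomized rankings, CTR by position estimates propensity.
import numpy as np

rng = np.random.default_rng(0)
true_propensity = np.array([1.0, 0.6, 0.4, 0.3, 0.25])     # P(examined | position)
n_items, n_sessions = 50, 100_000
attractiveness = rng.uniform(0.05, 0.5, size=n_items)      # P(click | examined)

clicks_at_pos = np.zeros(5)
for _ in range(n_sessions):
    shown = rng.choice(n_items, size=5, replace=False)      # fully randomized top-5
    examined = rng.random(5) < true_propensity
    clicked = examined & (rng.random(5) < attractiveness[shown])
    clicks_at_pos += clicked

ctr = clicks_at_pos / n_sessions
print("estimated propensities:", np.round(ctr / ctr[0], 3))
print("true propensities:     ", true_propensity)
```
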
0 implied HN points 21 Nov 22
  1. Sample size has a significant impact on model performance, with diminishing returns after a certain point.
  2. Training time typically grows linearly or worse in the number of samples, so its cost eventually outweighs the benefit of additional data.
  3. Parallelism in computing can help, but still comes with incremental costs and system scalability limitations.
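
A toy illustration of the trade-off (synthetic data, arbitrary sizes): the gain in log loss flattens with more samples while fit time keeps growing.

```python
# Learning-curve sketch: diminishing returns in loss, roughly linear cost in time.
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=400_000, n_features=30, random_state=0)
X_test, y_test = X[-50_000:], y[-50_000:]                 # hold out the tail for evaluation

for n in (1_000, 10_000, 100_000, 350_000):
    start = time.perf_counter()
    model = LogisticRegression(max_iter=1000).fit(X[:n], y[:n])
    elapsed = time.perf_counter() - start
    loss = log_loss(y_test, model.predict_proba(X_test)[:, 1])
    print(f"n={n:>7,}  log loss={loss:.4f}  fit time={elapsed:.2f}s")
```
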
0 implied HN points 19 Jun 23
  1. Inductive bias in machine learning refers to the assumptions a model relies on when generalizing from its training data.
  2. Restriction bias limits the types of hypotheses considered in a model, while preference bias favors certain hypotheses over others.
  3. Expressiveness of a model determines the types of relationships it can capture, and can be enhanced by adding relevant features or interactions.
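
The standard XOR example (not from the post) makes the expressiveness point concrete: a plain linear model's restriction bias rules XOR out, and adding the interaction feature puts it back in reach.

```python
# A linear model cannot represent XOR; adding the x1*x2 interaction makes it expressible.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])                                   # XOR of the two inputs

plain = LogisticRegression().fit(X, y)
X_inter = np.column_stack([X, X[:, 0] * X[:, 1]])            # add the interaction feature
richer = LogisticRegression(C=1e6, max_iter=10_000).fit(X_inter, y)   # weak regularization so it can fit

print("linear only:      ", plain.predict(X))                # cannot separate XOR
print("with interaction: ", richer.predict(X_inter))         # recovers [0, 1, 1, 0]
```
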
0 implied HN points 11 Mar 24
  1. Benchmark datasets are crucial in ML literature, providing a standard for evaluating new methods and influencing research directions.
  2. In learning-to-rank, the Yahoo and Microsoft datasets are prominent, with Yahoo dataset being widely used in notable papers.
  3. When writing a paper using benchmark datasets, researchers must choose ML algorithms, consider user behavior, generate initial rankings, and evaluate performance with metrics like NDCG.
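
A compact sketch of the usual evaluation metric, NDCG@k, in the common formulation with exponential gain and log2 discount; the example labels are made up.

```python
# NDCG@k from graded relevance labels, in the common 2^rel - 1 / log2 formulation.
import numpy as np

def dcg(relevances, k):
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))
    return np.sum((2 ** rel - 1) / discounts)

def ndcg(relevances_in_ranked_order, k=10):
    ideal = sorted(relevances_in_ranked_order, reverse=True)
    best = dcg(ideal, k)
    return dcg(relevances_in_ranked_order, k) / best if best > 0 else 0.0

# Relevance labels (0-4) of the documents in the order the model ranked them.
print(round(ndcg([2, 3, 0, 4, 1], k=5), 3))
```
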
0 implied HN points 17 Jul 23
  1. A model of everything predicts final and intermediate goals of a company, is causal, and covers significant inputs.
  2. Foundational choices in building a model of everything include deciding the scope, complexity of relationships, and optimization strategy.
  3. Financial forecasting often relies on models of everything built in spreadsheets, but the approach may not translate well to machine learning models.
0 implied HN points 24 Apr 23
  1. Experimentation regimes should focus on maximizing profit or a chosen metric, rather than blindly following traditional hypothesis testing rules.
  2. When designing experiments, prioritize actions that maximize profit, such as choosing sample size and treatment allocation strategically.
  3. Comparing the Test & Roll method with multi-armed bandits shows that both aim to maximize profit but have different complexities and decision-making structures.
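
A simulation sketch in the spirit of Test & Roll (all parameters invented): choose the per-arm test size that maximizes expected profit over the whole population, rather than the size a power calculation would dictate.

```python
# Profit-maximizing test sizing by simulation: bigger tests choose the better arm
# more reliably, but leave fewer users for the rollout.
import numpy as np

rng = np.random.default_rng(0)
N = 100_000                   # total audience to eventually roll out to
sigma = 1.0                   # noise in per-user profit
prior_sd = 0.05               # prior std dev of the true difference between arms
n_sims = 20_000

def expected_profit(n_test):
    profits = []
    for _ in range(n_sims):
        true_diff = rng.normal(0, prior_sd)                   # arm B minus arm A, per user
        a_mean = rng.normal(0, sigma / np.sqrt(n_test))       # observed mean of arm A
        b_mean = rng.normal(true_diff, sigma / np.sqrt(n_test))
        chosen_lift = true_diff if b_mean > a_mean else 0.0   # roll out whichever arm looked better
        # Extra profit vs. giving everyone A: B's test users plus the rollout population.
        profits.append(n_test * true_diff + (N - 2 * n_test) * chosen_lift)
    return np.mean(profits)

for n in (500, 2_000, 5_000, 10_000, 25_000):
    print(f"test size {n:>6,} per arm: expected extra profit {expected_profit(n):8.0f}")
```
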
0 implied HN points 22 May 23
  1. Two-tower models are a technique used in academia to improve ranking systems by modeling how position and user behavior affect clicks.
  2. Critiques have been raised against the two-tower models, questioning if they effectively separate biases and relevance in ranking.
  3. A new method called GradRev is emerging as a potential improvement over the previous two-tower models, applying a different approach to address bias in learning-to-rank systems.
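
A bare-bones sketch of the additive two-tower idea; this is my own toy version, not any particular paper's architecture. One tower scores relevance from item features, the other scores examination from position, and the two add in logit space.

```python
# Toy additive two-tower click model: relevance tower + position tower in logit space.
import torch
import torch.nn as nn

class TwoTowerClickModel(nn.Module):
    def __init__(self, n_features, n_positions):
        super().__init__()
        self.relevance_tower = nn.Sequential(nn.Linear(n_features, 16), nn.ReLU(), nn.Linear(16, 1))
        self.position_tower = nn.Embedding(n_positions, 1)      # one bias logit per position

    def forward(self, features, position):
        logit = self.relevance_tower(features).squeeze(-1) + self.position_tower(position).squeeze(-1)
        return torch.sigmoid(logit)

model = TwoTowerClickModel(n_features=8, n_positions=10)
features = torch.randn(32, 8)
position = torch.randint(0, 10, (32,))
print(model(features, position).shape)        # torch.Size([32]); at serving time, drop the position tower
```
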
0 implied HN points 12 Feb 24
  1. Position bias can affect the inputs of machine learning models when features reflect prior user behavior, leading to biased estimations of relevance.
  2. Using inverse propensity weighting (IPW), as in IPW-CTR, can help mitigate position bias in features, but it can produce high variance because clicks are divided by small propensities.
  3. The choice of weights to measure position bias is crucial, as observed click propensities may overestimate the bias, impacting the performance of features designed to address bias-variance trade-offs.
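
A rough sketch of an IPW-style feature (propensities and clicks invented): each click is reweighted by the inverse of its position's examination propensity, and a single click at a low-propensity position can dominate the estimate.

```python
# IPW-style CTR feature: reweight clicks by inverse position propensity.
import numpy as np

propensity = np.array([1.0, 0.6, 0.4, 0.3, 0.25, 0.2, 0.15, 0.1, 0.08, 0.05])

def ipw_ctr(positions, clicks):
    # Unweighted CTR mixes relevance with where the item happened to be shown;
    # the IPW version divides each click by its position's examination propensity.
    naive = clicks.mean()
    weighted = np.mean(clicks / propensity[positions])
    return naive, weighted

positions = np.array([0, 1, 2, 9, 9])          # ranks the item was shown at (0-indexed)
clicks = np.array([1, 0, 0, 1, 0])             # the click at rank 10 gets weight 1/0.05 = 20
print(ipw_ctr(positions, clicks))
```
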
0 implied HN points 02 Jan 23
  1. Marketing effectiveness is often measured with Return on Ad Spend (ROAS), which divides the revenue attributed to advertising by the ad spend.
  2. The incremental revenue a campaign actually causes can be far lower than the revenue attributed to it, which makes lift measurement crucial.
  3. Observational methods for measuring marketing impact are often inaccurate compared to randomized control trials due to bias and selection effects.
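
A toy calculation (all numbers made up) of the gap the post describes: ROAS computed from attributed revenue versus incremental ROAS measured against a randomized holdout.

```python
# Naive attributed ROAS versus incremental ROAS from a randomized holdout.
spend = 100_000.0

# Attribution system: revenue from users who clicked or viewed an ad.
attributed_revenue = 500_000.0
naive_roas = attributed_revenue / spend                       # 5.0x, looks great

# Randomized holdout: compare revenue per user between exposed and held-out groups.
n_treatment, rev_per_user_treatment = 1_000_000, 0.62
n_holdout, rev_per_user_holdout = 1_000_000, 0.50
incremental_revenue = n_treatment * (rev_per_user_treatment - rev_per_user_holdout)
incremental_roas = incremental_revenue / spend                # 1.2x, much less impressive

print(f"naive ROAS: {naive_roas:.1f}x   incremental ROAS: {incremental_roas:.1f}x")
```
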