Simplicity is SOTA

Simplicity is SOTA focuses on insights into machine learning, software engineering practices, and the dynamics of marketplaces. It critically examines established methodologies, discussing the impact of data sampling, experiment design, model biases, and the effectiveness of various machine learning techniques and programming concepts in improving prediction, optimization, and system design.

Machine Learning Techniques · Software Engineering Practices · Data Sampling and Analysis · Experiment Design and Analysis · Model Biases and Calibration · Programming Concepts and Optimization · Marketplace Dynamics · Position Bias and User Behavior · Activation Functions and Model Architecture · Marketing Effectiveness Measurement

The hottest Substack posts of Simplicity is SOTA

And their main takeaways
122 HN points 10 Apr 23
  1. The standard use of p < 0.05 as a threshold in experiment analysis may not be as useful as commonly believed.
  2. The choice of p < 0.05 as a significance level in experiments is a default that was set nearly a century ago.
  3. In the tech industry, where the goal is to find real product improvements, the risk of false negatives should also be carefully considered, not just false positives.
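
To make the false-negative point concrete, here is a minimal simulation (all numbers invented): a real but modest lift is often missed at p < 0.05, and less often at a looser threshold.

```python
# A minimal simulation (hypothetical numbers): how often does a real but
# modest improvement fail to reach significance at different alpha levels?
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, true_lift, n_sims = 2000, 0.05, 2000          # per-arm sample size, real effect, repetitions
misses = {0.05: 0, 0.20: 0}                       # false negatives at each threshold

for _ in range(n_sims):
    control = rng.normal(1.00, 1.0, n)
    treatment = rng.normal(1.00 + true_lift, 1.0, n)
    _, p = stats.ttest_ind(treatment, control)
    for alpha in misses:
        misses[alpha] += p >= alpha               # real improvement, but not "significant"

for alpha, count in misses.items():
    print(f"alpha={alpha:.2f}: missed the real lift in {count / n_sims:.0%} of experiments")
```
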
2 HN points 27 Mar 23
  1. The concept of 'embedding' in machine learning has evolved and become widely used, replacing terms like vectors and representations.
  2. Embeddings can be applied to various types of data, come from different layers in a neural network, and are not always about reducing dimensions.
  3. Defining 'embedding' has become challenging due to its widespread use, but the essence is about learned transformations that make data more useful.
2 HN points 24 Oct 22
  1. Balanced datasets provide more information for any fixed sample size.
  2. Balancing samples through oversampling or undersampling may lead to overfitting and decreased model performance.
  3. Balancing datasets may not necessarily improve class separation and can negatively impact calibration and log loss.
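
A quick sketch of the calibration point, using synthetic data rather than anything from the post: undersampling the majority class can leave ranking quality roughly intact while distorting predicted probabilities and log loss.

```python
# Illustrative only: undersampling the majority class can hurt calibration and
# log loss even if ranking quality (AUC) barely changes.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=50_000, n_features=10, weights=[0.97, 0.03], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Model 1: fit on the data as-is.
raw = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Model 2: fit on a 50/50 undersampled version of the training set.
rng = np.random.default_rng(0)
pos = np.flatnonzero(y_tr == 1)
neg = rng.choice(np.flatnonzero(y_tr == 0), size=len(pos), replace=False)
idx = np.concatenate([pos, neg])
balanced = LogisticRegression(max_iter=1000).fit(X_tr[idx], y_tr[idx])

for name, model in [("as-is", raw), ("undersampled", balanced)]:
    p = model.predict_proba(X_te)[:, 1]
    print(name, "AUC", round(roc_auc_score(y_te, p), 3),
          "log loss", round(log_loss(y_te, p), 3),
          "mean p", round(p.mean(), 3), "base rate", round(y_te.mean(), 3))
```
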
1 HN point 13 Mar 23
  1. Log loss is a proper scoring function that incentivizes honest prediction and has intrinsic meaning.
  2. Cross entropy in multiclass problems is based on log loss, which compares predictions to outcomes on a log scale.
  3. Modifying cross entropy so that the loss also penalizes predictions for the negative classes can complicate the gradient calculations and model fitting.
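
A small numeric check of the "proper scoring" claim (my own toy example, not the post's): the expected log loss is minimized by reporting the true probability.

```python
# Log loss is a proper scoring rule: in expectation it is minimized by honesty.
import numpy as np

true_p = 0.3                                   # true probability of the positive class
candidates = np.linspace(0.05, 0.95, 19)       # probabilities a model might report

def expected_log_loss(q, p=true_p):
    # Expected loss of predicting q when the outcome is 1 with probability p.
    return -(p * np.log(q) + (1 - p) * np.log(1 - q))

losses = [expected_log_loss(q) for q in candidates]
best = candidates[int(np.argmin(losses))]
print(f"expected log loss is minimized near q = {best:.2f} (true p = {true_p})")
```
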
0 implied HN points 05 Dec 22
  1. The order in which search results are presented matters a lot because it affects user clicks.
  2. Models that ignore position when predicting outcomes can struggle to generalize when the ranking criteria change.
  3. Including position as a feature in models can improve accuracy in predicting user behavior, especially in scenarios where the model's functional form aligns with the behavior being modeled.
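
A rough sketch of the idea with simulated clicks (the setup and numbers are invented, not the post's): train with position as a feature, then hold position fixed at serving time so candidates are compared on relevance alone.

```python
# Synthetic example: include position as a feature during training, then score
# every candidate as if it were shown in position 1.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 100_000
relevance = rng.normal(size=n)                       # hypothetical relevance signal
position = rng.integers(1, 11, size=n)               # rank 1..10 at logging time
examine = 1.0 / position                             # position-based examination probability
click_prob = examine / (1 + np.exp(-relevance))      # examination x attractiveness
clicks = rng.random(n) < click_prob

X = np.column_stack([relevance, np.log(position)])
model = LogisticRegression().fit(X, clicks)

# At serving time, hold the position feature fixed (log of rank 1, i.e. 0).
candidates = np.array([[-1.0, 0.0], [0.0, 0.0], [2.0, 0.0]])
print(model.predict_proba(candidates)[:, 1])         # increases with relevance
```
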
0 implied HN points 07 Nov 22
  1. Having more data may not always improve model performance, especially when data recency is important.
  2. In some domains where data is accumulated over time, the most recent data may be the most representative for predicting the near future.
  3. Using simulations to evaluate different sample durations can help determine the ideal sample size for model training.
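
An illustrative simulation of the sample-duration point (the drift here is invented): when the underlying relationship changes over time, a longer training window is not automatically better.

```python
# When the true relationship drifts, older data can hurt more than it helps.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
weeks, n_per_week = 52, 500
slope = np.linspace(1.0, 3.0, weeks)                     # the true coefficient drifts over the year
X = rng.normal(size=(weeks, n_per_week, 1))
y = slope[:, None] * X[..., 0] + rng.normal(0, 0.5, size=(weeks, n_per_week))

test_X, test_y = X[-1].reshape(-1, 1), y[-1].ravel()     # evaluate on the most recent week
for window in (4, 12, 26, 51):                           # train on the trailing `window` weeks before it
    tr_X = X[-1 - window:-1].reshape(-1, 1)
    tr_y = y[-1 - window:-1].ravel()
    model = LinearRegression().fit(tr_X, tr_y)
    print(f"{window:2d}-week window: test MSE = {mean_squared_error(test_y, model.predict(test_X)):.3f}")
```
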
0 implied HN points 19 Dec 22
  1. Using randomization combined with a simple approach can help measure position bias in ranking systems effectively.
  2. The position-based propensity model (PBM) offers a straightforward and flexible way to model user behavior in ranking systems.
  3. Randomizing search results, even through slight variations like RandPair, can provide clean, sensible estimates of position-bias factors at relatively low cost.
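
A simplified sketch of the randomization idea, using a synthetic position-based model rather than the post's exact RandPair estimator: under randomized rankings, per-position click-through rates recover the examination propensities up to a constant.

```python
# Synthetic PBM: with randomized rankings, CTR by position estimates propensity.
import numpy as np

rng = np.random.default_rng(0)
true_propensity = np.array([1.0, 0.6, 0.4, 0.3, 0.25])     # P(examined | position)
n_items, n_sessions = 50, 100_000
attractiveness = rng.uniform(0.05, 0.5, size=n_items)      # P(click | examined)

clicks_at_pos = np.zeros(5)
for _ in range(n_sessions):
    shown = rng.choice(n_items, size=5, replace=False)      # fully randomized top-5
    examined = rng.random(5) < true_propensity
    clicked = examined & (rng.random(5) < attractiveness[shown])
    clicks_at_pos += clicked

ctr = clicks_at_pos / n_sessions
print("estimated propensities:", np.round(ctr / ctr[0], 3))
print("true propensities:     ", true_propensity)
```
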
0 implied HN points 21 Nov 22
  1. Sample size has a significant impact on model performance, with diminishing returns after a certain point.
  2. Training time typically grows linearly or worse in the number of samples, so its cost eventually outweighs the benefit of additional data.
  3. Parallelism in computing can help, but still comes with incremental costs and system scalability limitations.
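
A toy illustration of the trade-off (synthetic data, arbitrary sizes): the gain in log loss flattens with more samples while fit time keeps growing.

```python
# Learning-curve sketch: diminishing returns in loss, roughly linear cost in time.
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=400_000, n_features=30, random_state=0)
X_test, y_test = X[-50_000:], y[-50_000:]                 # hold out the tail for evaluation

for n in (1_000, 10_000, 100_000, 350_000):
    start = time.perf_counter()
    model = LogisticRegression(max_iter=1000).fit(X[:n], y[:n])
    elapsed = time.perf_counter() - start
    loss = log_loss(y_test, model.predict_proba(X_test)[:, 1])
    print(f"n={n:>7,}  log loss={loss:.4f}  fit time={elapsed:.2f}s")
```
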
0 implied HN points 19 Jun 23
  1. Inductive bias in machine learning refers to the assumptions a model relies on when generalizing from its training data.
  2. Restriction bias limits the types of hypotheses considered in a model, while preference bias favors certain hypotheses over others.
  3. Expressiveness of a model determines the types of relationships it can capture, and can be enhanced by adding relevant features or interactions.
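
The standard XOR example (not from the post) makes the expressiveness point concrete: a plain linear model's restriction bias rules XOR out, and adding the interaction feature puts it back in reach.

```python
# A linear model cannot represent XOR; adding the x1*x2 interaction makes it expressible.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])                                   # XOR of the two inputs

plain = LogisticRegression().fit(X, y)
X_inter = np.column_stack([X, X[:, 0] * X[:, 1]])            # add the interaction feature
richer = LogisticRegression(C=1e6, max_iter=10_000).fit(X_inter, y)   # weak regularization so it can fit

print("linear only:      ", plain.predict(X))                # cannot separate XOR
print("with interaction: ", richer.predict(X_inter))         # recovers [0, 1, 1, 0]
```
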
0 implied HN points 11 Mar 24
  1. Benchmark datasets are crucial in ML literature, providing a standard for evaluating new methods and influencing research directions.
  2. In learning-to-rank, the Yahoo and Microsoft datasets are prominent, with Yahoo dataset being widely used in notable papers.
  3. When writing a paper using benchmark datasets, researchers must choose ML algorithms, consider user behavior, generate initial rankings, and evaluate performance with metrics like NDCG.
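
A compact sketch of the usual evaluation metric, NDCG@k, in the common formulation with exponential gain and log2 discount; the example labels are made up.

```python
# NDCG@k from graded relevance labels, in the common 2^rel - 1 / log2 formulation.
import numpy as np

def dcg(relevances, k):
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))
    return np.sum((2 ** rel - 1) / discounts)

def ndcg(relevances_in_ranked_order, k=10):
    ideal = sorted(relevances_in_ranked_order, reverse=True)
    best = dcg(ideal, k)
    return dcg(relevances_in_ranked_order, k) / best if best > 0 else 0.0

# Relevance labels (0-4) of the documents in the order the model ranked them.
print(round(ndcg([2, 3, 0, 4, 1], k=5), 3))
```
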
0 implied HN points 17 Jul 23
  1. A model of everything predicts final and intermediate goals of a company, is causal, and covers significant inputs.
  2. Foundational choices in building a model of everything include deciding the scope, complexity of relationships, and optimization strategy.
  3. Financial forecasting often relies on models of everything built in spreadsheets, but the approach may not translate well to machine learning models.
0 implied HN points 24 Apr 23
  1. Experimentation regimes should focus on maximizing profit or a chosen metric, rather than blindly following traditional hypothesis testing rules.
  2. When designing experiments, prioritize actions that maximize profit, such as choosing sample size and treatment allocation strategically.
  3. Comparing the Test & Roll method with multi-armed bandits shows that both aim to maximize profit but have different complexities and decision-making structures.
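
A simulation sketch in the spirit of Test & Roll (all parameters invented): choose the per-arm test size that maximizes expected profit over the whole population, rather than the size a power calculation would dictate.

```python
# Profit-maximizing test sizing by simulation: bigger tests choose the better arm
# more reliably, but leave fewer users for the rollout.
import numpy as np

rng = np.random.default_rng(0)
N = 100_000                   # total audience to eventually roll out to
sigma = 1.0                   # noise in per-user profit
prior_sd = 0.05               # prior std dev of the true difference between arms
n_sims = 20_000

def expected_profit(n_test):
    profits = []
    for _ in range(n_sims):
        true_diff = rng.normal(0, prior_sd)                   # arm B minus arm A, per user
        a_mean = rng.normal(0, sigma / np.sqrt(n_test))       # observed mean of arm A
        b_mean = rng.normal(true_diff, sigma / np.sqrt(n_test))
        chosen_lift = true_diff if b_mean > a_mean else 0.0   # roll out whichever arm looked better
        # Extra profit vs. giving everyone A: B's test users plus the rollout population.
        profits.append(n_test * true_diff + (N - 2 * n_test) * chosen_lift)
    return np.mean(profits)

for n in (500, 2_000, 5_000, 10_000, 25_000):
    print(f"test size {n:>6,} per arm: expected extra profit {expected_profit(n):8.0f}")
```
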
0 implied HN points 22 May 23
  1. Two-tower models are a technique used in academia to improve ranking systems by modeling how position and user behavior affect clicks.
  2. Critiques have been raised against the two-tower models, questioning if they effectively separate biases and relevance in ranking.
  3. A new method called GradRev is emerging as a potential improvement over the previous two-tower models, applying a different approach to address bias in learning-to-rank systems.
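
A bare-bones sketch of the additive two-tower idea; this is my own toy version, not any particular paper's architecture. One tower scores relevance from item features, the other scores examination from position, and the two add in logit space.

```python
# Toy additive two-tower click model: relevance tower + position tower in logit space.
import torch
import torch.nn as nn

class TwoTowerClickModel(nn.Module):
    def __init__(self, n_features, n_positions):
        super().__init__()
        self.relevance_tower = nn.Sequential(nn.Linear(n_features, 16), nn.ReLU(), nn.Linear(16, 1))
        self.position_tower = nn.Embedding(n_positions, 1)      # one bias logit per position

    def forward(self, features, position):
        logit = self.relevance_tower(features).squeeze(-1) + self.position_tower(position).squeeze(-1)
        return torch.sigmoid(logit)

model = TwoTowerClickModel(n_features=8, n_positions=10)
features = torch.randn(32, 8)
position = torch.randint(0, 10, (32,))
print(model(features, position).shape)        # torch.Size([32]); at serving time, drop the position tower
```
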
0 implied HN points 12 Feb 24
  1. Position bias can affect the inputs of machine learning models when features reflect prior user behavior, leading to biased estimations of relevance.
  2. Using inverse propensity weighting (IPW), as in IPW-CTR, can help mitigate position bias in features, but it can produce high variance because clicks are divided by small propensities.
  3. The choice of weights to measure position bias is crucial, as observed click propensities may overestimate the bias, impacting the performance of features designed to address bias-variance trade-offs.
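
A rough sketch of an IPW-style feature (propensities and clicks invented): each click is reweighted by the inverse of its position's examination propensity, and a single click at a low-propensity position can dominate the estimate.

```python
# IPW-style CTR feature: reweight clicks by inverse position propensity.
import numpy as np

propensity = np.array([1.0, 0.6, 0.4, 0.3, 0.25, 0.2, 0.15, 0.1, 0.08, 0.05])

def ipw_ctr(positions, clicks):
    # Unweighted CTR mixes relevance with where the item happened to be shown;
    # the IPW version divides each click by its position's examination propensity.
    naive = clicks.mean()
    weighted = np.mean(clicks / propensity[positions])
    return naive, weighted

positions = np.array([0, 1, 2, 9, 9])          # ranks the item was shown at (0-indexed)
clicks = np.array([1, 0, 0, 1, 0])             # the click at rank 10 gets weight 1/0.05 = 20
print(ipw_ctr(positions, clicks))
```
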
0 implied HN points 02 Jan 23
  1. Marketing effectiveness is often measured with Return on Ad Spend (ROAS), which divides the revenue attributed to advertising by the ad spend.
  2. The incremental revenue a campaign actually causes can be far lower than the revenue attributed to it, which makes lift measurement crucial.
  3. Observational methods for measuring marketing impact are often inaccurate compared to randomized control trials due to bias and selection effects.
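
A toy calculation (all numbers made up) of the gap the post describes: ROAS computed from attributed revenue versus incremental ROAS measured against a randomized holdout.

```python
# Naive attributed ROAS versus incremental ROAS from a randomized holdout.
spend = 100_000.0

# Attribution system: revenue from users who clicked or viewed an ad.
attributed_revenue = 500_000.0
naive_roas = attributed_revenue / spend                       # 5.0x, looks great

# Randomized holdout: compare revenue per user between exposed and held-out groups.
n_treatment, rev_per_user_treatment = 1_000_000, 0.62
n_holdout, rev_per_user_holdout = 1_000_000, 0.50
incremental_revenue = n_treatment * (rev_per_user_treatment - rev_per_user_holdout)
incremental_roas = incremental_revenue / spend                # 1.2x, much less impressive

print(f"naive ROAS: {naive_roas:.1f}x   incremental ROAS: {incremental_roas:.1f}x")
```
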