MLOps Newsletter

The MLOps Newsletter focuses on the latest developments in machine learning operations, including detailed explanations of algorithms, breakthroughs in large language models (LLMs), applications in fields as diverse as climate forecasting and digital platforms, and tools and frameworks for improving ML model efficiency and accessibility.

Topics: Machine Learning Operations · Large Language Models · Recommender Systems · Model Optimization and Inference · Generative AI · Open-Source Machine Learning Tools · AI Ethics and Diversity · Data Visualization and Processing · Algorithm Development and Evaluation

The hottest Substack posts of MLOps Newsletter

And their main takeaways
176 implied HN points 20 Jan 24
  1. Google announced an AI system for medical diagnosis and conversation called AMIE.
  2. AMIE's architecture includes multi-turn dialogue management, a hierarchical reasoning model, and a modular design.
  3. AMIE showed promising performance in simulated diagnostic conversations, outperforming primary care physicians (PCPs) and matching specialist physicians.
78 implied HN points 27 Jan 24
  1. Modular Deep Learning proposes splitting models into smaller, independent modules for specific subtasks.
  2. Modularity in AI development can lead to a collaborative and efficient ecosystem and democratize AI development.
  3. PyTorch 2.0 introduces performance gains through faster inference and training, autotuning, quantization, and improved memory management.
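
The headline mechanism behind those PyTorch 2.0 gains is torch.compile, which captures a model into an optimized, autotuned graph. A minimal sketch (the model and shapes here are hypothetical placeholders):

```python
import torch
import torch.nn as nn

# A small hypothetical model; torch.compile works on any nn.Module.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

# PyTorch 2.0: compile once; subsequent calls run through the captured,
# autotuned graph instead of eager mode.
compiled = torch.compile(model)

x = torch.randn(32, 128)
out = compiled(x)  # first call triggers compilation; later calls are fast
```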
39 implied HN points 10 Feb 24
  1. Graph Neural Networks in TensorFlow address data complexity, limited resources, and generalizability in learning from graph-structured data.
  2. RadixAttention and a Domain-Specific Language (DSL) are key solutions for efficiently controlling Large Language Models (LLMs), reducing memory usage, and providing a user-friendly interface (a conceptual prefix-caching sketch follows this list).
  3. VideoPoet demonstrates hierarchical LLM architecture for zero-shot learning, handling multimodal input, and generating various output formats in video generation tasks.
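
RadixAttention's core idea is reusing cached KV state across requests that share a prompt prefix. Below is a conceptual sketch using a token-keyed trie; it illustrates the caching idea only and is not SGLang's actual implementation:

```python
# Conceptual prefix cache in the spirit of RadixAttention: requests that
# share a prompt prefix reuse cached state instead of recomputing it.

class PrefixCacheNode:
    def __init__(self):
        self.children = {}    # token id -> PrefixCacheNode
        self.kv_state = None  # placeholder for cached KV tensors

class PrefixCache:
    def __init__(self):
        self.root = PrefixCacheNode()

    def longest_cached_prefix(self, tokens):
        """Return the number of leading tokens whose KV state is cached."""
        node, matched = self.root, 0
        for t in tokens:
            child = node.children.get(t)
            if child is None or child.kv_state is None:
                break
            node, matched = child, matched + 1
        return matched

    def insert(self, tokens, kv_states):
        node = self.root
        for t, kv in zip(tokens, kv_states):
            node = node.children.setdefault(t, PrefixCacheNode())
            node.kv_state = kv

cache = PrefixCache()
cache.insert([1, 2, 3], ["kv1", "kv2", "kv3"])
print(cache.longest_cached_prefix([1, 2, 3, 4]))  # 3 -> only token 4 is new
```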
39 implied HN points 04 Feb 24
  1. Graph transformers are powerful for machine learning on graph-structured data but face challenges with memory limitations and complexity.
  2. Exphormer overcomes memory bottlenecks using expander graphs, intermediate nodes, and hybrid attention mechanisms.
  3. Optimizing mixed-input matrix multiplication for large language models involves efficient hardware mapping and innovative techniques like FastNumericArrayConvertor and FragmentShuffler.
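
Numerically, mixed-input matmul boils down to upconverting quantized weights to the activation dtype before the multiply; techniques like FastNumericArrayConvertor and FragmentShuffler do this at the register level on the GPU. A NumPy sketch of just the numeric recipe (shapes and the scale are placeholders):

```python
import numpy as np

# Mixed-input matmul sketch: fp16 activations x int8 weights.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 64)).astype(np.float16)        # activations
w_int8 = rng.integers(-128, 128, (64, 32), dtype=np.int8)  # quantized weights
scale = np.float16(0.02)                                   # per-tensor scale

w_fp16 = w_int8.astype(np.float16) * scale  # upconvert before the multiply
y = x @ w_fp16                              # matmul runs entirely in fp16
```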
98 implied HN points 07 Oct 23
  1. Pinterest improved their Closeup Recommendation System with foundational changes like hybrid data logging and sampling.
  2. Pinterest uses a model refreshing framework to keep their Closeup Recommendation model up-to-date and adaptable.
  3. Distilling step-by-step can help train smaller, more efficient, and more interpretable language models from large LLMs.
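
Distilling step-by-step trains the student with a multi-task objective: predict the gold labels and reproduce teacher-generated rationales. A PyTorch sketch of that loss; tensor names and the weighting scheme are illustrative, not the paper's exact code:

```python
import torch.nn.functional as F

def distill_step_by_step_loss(label_logits, labels,
                              rationale_logits, rationale_tokens,
                              alpha=0.5):
    # Task 1: standard supervised loss on gold labels.
    label_loss = F.cross_entropy(label_logits, labels)
    # Task 2: token-level loss on rationales generated by the teacher LLM.
    rationale_loss = F.cross_entropy(
        rationale_logits.flatten(0, 1),  # (batch*seq, vocab)
        rationale_tokens.flatten(),      # teacher rationale token ids
    )
    # Weighted multi-task objective.
    return alpha * label_loss + (1 - alpha) * rationale_loss
```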
157 implied HN points 30 Jul 23
  1. TikTok's recommendation system is designed to give real-time suggestions using sparsity-aware factorization machines (a minimal scoring sketch follows this list), online learning, and caching.
  2. Multimodal deep learning focuses on text-image modeling due to the lack of large annotated datasets for other modalities such as video and audio.
  3. A new framework called Parsel enables automatic implementation of complex algorithms with code language models, leading to better problem-solving results in competitions.
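
For reference, a second-order factorization machine scores a sparse feature vector using the standard O(kn) identity. This toy NumPy version illustrates the model family, not TikTok's production system:

```python
import numpy as np

# FM pairwise term: sum_{i<j} <v_i, v_j> x_i x_j
#   = 0.5 * sum_f [ (sum_i v_{i,f} x_i)^2 - sum_i v_{i,f}^2 x_i^2 ]
def fm_score(x, w0, w, V):
    linear = w0 + x @ w
    s = V.T @ x                   # (k,) factor sums
    s_sq = (V.T ** 2) @ (x ** 2)  # (k,) squared-term correction
    return linear + 0.5 * np.sum(s ** 2 - s_sq)

n, k = 1000, 8                        # sparse feature dim, factor dim
rng = np.random.default_rng(1)
x = np.zeros(n)
x[[3, 42, 777]] = 1.0                 # sparse one-hot features
score = fm_score(x, 0.0,
                 rng.standard_normal(n) * 0.01,
                 rng.standard_normal((n, k)) * 0.01)
```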
78 implied HN points 05 Aug 23
  1. ClimaX is a deep learning model designed for weather and climate tasks like forecasting temperature and predicting extreme weather events.
  2. XGen is a 7B-parameter LLM trained on sequences of up to 8K tokens, achieving state-of-the-art results on benchmarks like MMLU, QA, and HumanEval.
  3. The GPT-4 API from OpenAI provides easy access to a powerful language model capable of generating text, translating languages, and answering questions.
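
Calling the GPT-4 API takes only a few lines with the official openai Python client (shown with the current v1 client, which differs slightly from the surface available when this post was written; the prompt is a placeholder and the API key is read from the environment):

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user",
               "content": "Translate 'machine learning' into French."}],
)
print(response.choices[0].message.content)
```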
58 implied HN points 06 Aug 23
  1. Embedding as a Service (EaaS) provides easy access to pre-trained embeddings for natural language processing (NLP) tasks.
  2. Model as a Service (MaaS) offers pre-trained models for tasks like image classification and can be more accurate but may be more expensive.
  3. EaaS is cost-effective and offers flexibility, while MaaS provides models with higher accuracy and interpretability.
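
The EaaS pattern is a single API call that returns vectors, with no model to host. A sketch using OpenAI's embeddings endpoint as one example provider (the model name is an example choice, not an endorsement from the post):

```python
from openai import OpenAI

client = OpenAI()
result = client.embeddings.create(
    model="text-embedding-3-small",
    input=["mlops newsletter", "model as a service"],
)
vectors = [item.embedding for item in result.data]  # one vector per input
```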
58 implied HN points 03 Jun 23
  1. Stanford introduced AlpacaFarm to make research on reinforcement learning from human feedback (RLHF) accessible, fast, and cost-effective.
  2. Google presented Plex, a framework for reliable deep learning model architectures.
  3. Various libraries and tools, such as Guidance, LMQL, and Open-Llama, are available for steering and building on language models.
39 implied HN points 20 Feb 23
  1. Google open-sourced its black-box optimization library, Vizier, for reliable tuning and optimization (a generic suggest-and-evaluate loop is sketched after this list).
  2. Pinterest introduced Lightweight Ranking to recommend Pins with better relevance and build scalable ML models.
  3. Netflix uses ML to predict out-of-memory (OOM) issues in production, overcoming data engineering challenges like structuring data.
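
Black-box optimizers like Vizier expose a suggest-evaluate-report loop: the service proposes trial parameters, the client measures the objective and reports back. This toy random-search version shows only the shape of that loop; it is not Vizier's API:

```python
import random

def objective(x):
    """The 'black box' being tuned; the optimizer never sees its internals."""
    return -(x - 0.3) ** 2

def suggest():
    """Stand-in for the optimizer's suggestion step (here: random search)."""
    return random.uniform(0.0, 1.0)

best_x, best_y = None, float("-inf")
for _ in range(100):
    x = suggest()       # optimizer proposes a trial
    y = objective(x)    # client evaluates and reports the measurement
    if y > best_y:
        best_x, best_y = x, y
print(best_x, best_y)
```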
39 implied HN points 19 Mar 23
  1. OpenAI has launched GPT-4, a significant improvement over GPT-3 and ChatGPT.
  2. GPT-4's capabilities include strong performance on academic benchmarks, improved steerability, and processing of visual inputs.
  3. OpenAI has introduced Whisper and ChatGPT APIs for commercial use cases.
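
The Whisper API follows the same client pattern, one call per audio file (again using the v1 openai client; the file path is a placeholder):

```python
from openai import OpenAI

client = OpenAI()
with open("meeting.mp3", "rb") as audio:  # placeholder audio file
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio,
    )
print(transcript.text)
```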
39 implied HN points 09 Apr 23
  1. Twitter has open-sourced their recommendation algorithm for both training and serving layers.
  2. The algorithm involves candidate generation for in-network and out-network tweets, ranking models, and filtering based on different metrics.
  3. Twitter's recommendation algorithm is user-centric, focusing on user-to-user relationships before recommending tweets.
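
End to end, the open-sourced pipeline runs candidate generation, then ranking, then filtering. A conceptual sketch with placeholder logic standing in for Twitter's actual models and heuristics:

```python
# Two-stage recommendation flow: candidate generation (in-network plus
# out-of-network), model-based ranking, then policy filtering.
# All bodies below are hypothetical placeholders, not Twitter's code.

def generate_candidates(user):
    in_network = [("tweet_a", "followed"), ("tweet_b", "followed")]
    out_network = [("tweet_c", "similar_users")]
    return in_network + out_network

def rank(user, candidates):
    def score(candidate):
        _, source = candidate
        return 1.0 if source == "followed" else 0.5  # toy heuristic model
    return sorted(candidates, key=score, reverse=True)

def apply_filters(user, ranked):
    blocked = {"tweet_b"}  # e.g. muted or blocked content
    return [c for c in ranked if c[0] not in blocked]

timeline = apply_filters("user_1", rank("user_1", generate_candidates("user_1")))
print([tweet for tweet, _ in timeline])
```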