The hottest data science Substack posts right now

And their main takeaways

[in case you missed it] Data Science Weekly - Issue 437

Data Science Weekly Newsletter • 19 implied HN points • 10 Apr 22

🕹 Technology Data science

Distribution shift is a big challenge in machine learning. If we ignore how data changes in the real world, our models may fail.
Tech apprenticeships are becoming more common and are a great way to learn while earning money. They help people start new careers in tech, even without a degree.
There's ongoing research to give computers common sense. This could help AI understand the world better and make smarter decisions.

Data Science Weekly - Issue 437

Data Science Weekly Newsletter • 19 implied HN points • 07 Apr 22

🕹 Technology Data science

Data in the real world can change, and we need to think about that when we use machine learning. If we don't, our models may not work well when they are put to the test.
Attending conferences can be a great way to learn and connect with others in the field. They often showcase new startups and many interesting themes that can inspire ideas.
Tech apprenticeships are a rising opportunity. They allow you to earn while you learn skills for a technology career, making it accessible for more people.

E5 - Roles and Responsibilities of an AI Product Manager

The Product Channel By Sid Saladi • 6 implied HN points • 08 Dec 24

🕹 Technology Data science

AI product managers play a key role in creating and managing AI-powered products. They need to combine technical knowledge with an understanding of user needs.
Their responsibilities include researching AI applications, creating product strategies, and leading development teams. They ensure that products are both viable in the market and valuable to users.
To succeed, AI product managers should have skills in AI, business, and user experience. A mix of education in tech, business, and design helps prepare them for this role.

Data Science Weekly - Issue 436

Data Science Weekly Newsletter • 19 implied HN points • 31 Mar 22

🕹 Technology Data science

Aggregating data can hide important details and context. It's better to focus on specific aspects of the data to find deeper insights.
Waymo is testing fully autonomous vehicles in San Francisco. This effort aims to integrate self-driving technology into everyday life for its employees.
AI can help improve representation on platforms like Wikipedia. A new approach is being developed to ensure more diverse biographies are created.

Data Science Weekly - Issue 435

Data Science Weekly Newsletter • 19 implied HN points • 24 Mar 22

🕹 Technology Data science

Algorithmic assessments can help ensure that healthcare technology benefits everyone involved. It's important to evaluate how data is used in these systems.
Relying solely on deep learning for electronic medical records may not be the best idea right now. Instead, better IT support is needed to improve healthcare systems.
Many claims about explaining AI technology are misleading. Experts agree that what we currently call 'explainable AI' often falls short of being truly understandable.

Data Science Weekly - Issue 434

Data Science Weekly Newsletter • 19 implied HN points • 17 Mar 22

🕹 Technology Data science

Understanding NLP is important. It starts with tokenization and encoding, the steps that turn raw text into a form machines can process.
Performance in deep learning can often feel random, but reasoning from first principles can help simplify the process. Focus on compute, memory, and overhead to improve performance.
There is a growing need for data product managers as data teams modernize. These managers bridge the gap between data science insights and product development.
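
To make the tokenization and encoding step from the first takeaway concrete, here is a minimal sketch. It assumes whitespace tokenization and a plain integer vocabulary; the function names and the `<unk>` token are illustrative choices, and production tokenizers usually work on subwords instead.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a simplification of real tokenizers)."""
    return text.lower().split()


def build_vocab(corpus):
    """Assign each distinct token an integer id; id 0 is reserved for unknown tokens."""
    vocab = {"<unk>": 0}
    for sentence in corpus:
        for token in tokenize(sentence):
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(text, vocab):
    """Turn text into a list of integer ids, mapping unseen tokens to <unk>."""
    return [vocab.get(token, vocab["<unk>"]) for token in tokenize(text)]


corpus = ["the cat sat", "the dog sat down"]
vocab = build_vocab(corpus)
ids = encode("The bird sat", vocab)
print(ids)  # "bird" is not in the vocabulary, so it maps to id 0
```

The integer ids are what a model actually consumes; everything downstream (embeddings, attention) operates on them rather than on raw strings.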

Data Science Weekly - Issue 433

Data Science Weekly Newsletter • 19 implied HN points • 10 Mar 22

🕹 Technology Data science

Deep learning is facing challenges, and experts are exploring what it needs to improve. It's important for AI to overcome these hurdles to progress further.
MLOps, or machine learning operations, is currently complicated, but it's a growing field that promises future innovations. New tools and methods are emerging rapidly, making it tricky for newcomers to find their way.
Visualizing data effectively is essential for making sense of complex information. Standards are being developed to help create better visuals, which makes it easier for everyone to understand data.

Data Science Weekly - Issue 432

Data Science Weekly Newsletter • 19 implied HN points • 03 Mar 22

🕹 Technology Data science

AI art has evolved quickly, becoming more relatable and controllable thanks to advancements in technology. Many people, even experts, are surprised by how realistic and detailed AI-generated images can now be.
Conversational agents, like chatbots, are becoming more common and can serve different purposes, from casual chats to helping users complete specific tasks. However, understanding their impact on society is important as they become more integrated into daily life.
The CX-ToM framework improves explainable AI by creating a dialogue between machines and humans for better understanding. This approach focuses on the intentions of both the user and the machine, making AI decisions clearer.

Fractal, GPT3 for CV & our Vox-pop 🚕🎡🎨

Sector 6 | The Newsletter of AIM • 19 implied HN points • 09 Jan 22

🕹 Technology Data science

Fractal technology is playing a big role in AI discussions and developments.
GPT-3 is being used for computer vision, which is exciting for tech advancements.
There is a new forum for sharing data science questions and getting help from the community.

Data Science Weekly - Issue 431

Data Science Weekly Newsletter • 19 implied HN points • 24 Feb 22

🕹 Technology Data science

Vector databases are important for storing and searching data in various applications like image search and drug discovery.
Statistics may not be the best path to becoming a data scientist; other fields could be more relevant and useful.
Teaching and practicing reproducible workflows in data science helps ensure that research and findings can be verified and built upon.

✅ Monday briefing: Word play, training new talent, agency globalisation stalls, data science analysis, levelling up, ad platform data failure, web accessibility, and more…

Wadds Inc. newsletter • 19 implied HN points • 07 Feb 22

💼 Business Data science

There's a new training program called 3THINKR'S Academy that offers free sessions for people starting their careers in communications. It's a great chance to learn skills like time management and social media strategy.
Big advertising agencies are slowing down their global expansion plans. Many are focusing more on local markets due to challenges like supply chain issues and political situations.
Facebook experienced a decline in daily users for the first time recently, partly because of competition from platforms like TikTok. This has worried investors about the future of the company's ad revenue.

Data Science Weekly - Issue 430

Data Science Weekly Newsletter • 19 implied HN points • 17 Feb 22

🕹 Technology Data science

Data businesses are important but not well-studied, and understanding their models can help in a tech-focused market.
Investors are focusing on machine learning and its challenges, which can show opportunities for startups in that field.
Machine learning is evolving, and growing compute requirements are becoming a crucial factor in training complex models.

Data Science Weekly - Issue 429

Data Science Weekly Newsletter • 19 implied HN points • 10 Feb 22

🕹 Technology Data science

Data science models need regular monitoring after deployment. They can lose effectiveness over time, so it's important to keep an eye on their performance.
Recommender systems help users find relevant content among large amounts of data. They are essential tools for platforms like YouTube and Facebook.
Causal knowledge is important for making good business decisions. Relying solely on prediction-based methods may not address complex managerial problems.

The war for large language models

Sector 6 | The Newsletter of AIM • 19 implied HN points • 19 Dec 21

🕹 Technology Data science

DeepMind has released a new language model called Gopher with 280 billion parameters. This shows how competitive the field of AI is getting.
Google followed with its own model called GLaM, which is even larger at 1.2 trillion parameters. These advancements highlight the rapid progress in AI technology.
Both companies are pushing the boundaries of what large language models can do, using innovative techniques to improve performance and efficiency. It's exciting to see how these developments will shape the future of AI.

Data Science Weekly - Issue 428

Data Science Weekly Newsletter • 19 implied HN points • 03 Feb 22

🕹 Technology Data science

Information Theory has evolved over time, influenced by technology and significant events like the space race, shaping its focus and impact across various fields.
DeepMind's AlphaCode can compete in programming challenges, showing how AI can be developed to solve complex problems requiring a mix of skills.
Understanding the concept of typicality is important in generative models, as it helps clarify issues with common methods like beam search and anomaly detection.

Data Science Weekly - Issue 427

Data Science Weekly Newsletter • 19 implied HN points • 27 Jan 22

🕹 Technology Data science

Using offline replay experimentation can help predict results faster, cutting down the time usually needed for online experiments.
Bad data can seriously affect business operations, and understanding how it breaks is crucial for fixing dashboards and reports.
Shapley values can explain machine learning models by attributing each prediction to the contributions of individual features, making the model's decisions clearer.
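
As a rough illustration of the Shapley-value takeaway above, the sketch below computes exact Shapley values by averaging each feature's marginal contribution over every ordering of the features. The payoff function `toy_value` and the feature names are hypothetical stand-ins for a real model, and the brute-force enumeration is only practical for a handful of features.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            totals[f] += after - before
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_value(coalition):
    """Hypothetical payoff: model output when only these features are 'known'."""
    score = 0.0
    if "age" in coalition:
        score += 10
    if "income" in coalition:
        score += 20
    if "age" in coalition and "income" in coalition:
        score += 6  # interaction effect, split evenly between the two features
    return score


phi = shapley_values(toy_value, ["age", "income", "region"])
print(phi)  # contributions sum to toy_value of the full feature set
```

The contributions always add up to the payoff of the full feature set, which is the property that makes Shapley values attractive for explaining individual predictions.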

✳✴❇ Behavioural Science, Better Data Engineering & Timnit Gebru’s New Institute

Sector 6 | The Newsletter of AIM • 19 implied HN points • 05 Dec 21

🕹 Technology Data science

Behavioral science can improve how data engineering is done. Understanding how people think and behave helps create better tech solutions.
There’s a new hackathon for data scientists featuring a challenge to predict loan defaults. It has already attracted over 1,000 participants.
A conference for machine learning developers will be held in-person in Bangalore. It's a great opportunity to learn and connect with others in the field.

Slides for my talk at PyData London 2023

Laszlo’s Newsletter • 21 implied HN points • 05 Jun 23

🕹 Technology Data science

The talk covered code smells in data science and ways to fix them.
It stressed the importance of code readability and of establishing a culture around it.
The slides are available for download for further reference.

Data Science Weekly - Issue 426

Data Science Weekly Newsletter • 19 implied HN points • 20 Jan 22

🕹 Technology Data science

Prospective learning is important because it focuses on preparing for future challenges instead of just learning from past experiences. This helps both humans and AI to adapt to new situations better.
AI is set to change the field of medicine greatly, making things better for both doctors and patients by improving medical tools and approaches. But there are important ethical and technical issues to consider, like data fairness and bias.
Using vectorization can speed up Python code significantly, but it's essential to understand what it means and when to apply it. This way, you can handle large sets of data more efficiently.

Data Science Weekly - Issue 425

Data Science Weekly Newsletter • 19 implied HN points • 13 Jan 22

🕹 Technology Data science

Be careful when joining a data or tech team; look for warning signs that could mean trouble. It's important to ensure a good fit for your career.
The AI job market is constantly changing, so it's good to stay informed and adapt your strategies for landing jobs in this field.
Transformers are now widely used in natural language processing and are also making their way into computer vision, making it important to understand how they work.

The Intricate Link Between Compression and Prediction

Mindful Modeler • 3 HN points • 24 Oct 23

🕹 Technology Data science

K-nearest neighbors with compressed documents can outperform deep learning models for text classification.
Compression and prediction are closely linked: a good theory about the world both compresses observations and predicts them well.
Good predictors can also be good compressors; models like language models act as compressors while predicting.
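
The compression-based classifier in the first takeaway can be sketched roughly as follows. This assumes the normalized compression distance (NCD) computed with gzip as the similarity measure; the helper names and the tiny labeled set are made up for illustration, and this is not the post's own code.

```python
import gzip


def clen(text):
    """Length in bytes of the gzip-compressed text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a, b):
    """Normalized compression distance: small when a and b share structure."""
    ca, cb = clen(a), clen(b)
    cab = clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def knn_predict(query, labeled, k=1):
    """Majority label among the k labeled documents closest to the query under NCD."""
    ranked = sorted(labeled, key=lambda pair: ncd(query, pair[0]))
    top = [label for _, label in ranked[:k]]
    return max(set(top), key=top.count)


# Hypothetical toy training set of (document, label) pairs.
labeled = [
    ("the striker scored a late goal and the team won the football match", "sports"),
    ("the central bank raised interest rates as inflation kept climbing", "finance"),
]
print(knn_predict("the team won the match after a late goal", labeled))
```

No parameters are trained here: if concatenating two documents compresses almost as well as one of them alone, they share vocabulary and phrasing, and that similarity is all k-nearest neighbors needs.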

September Newsletter

RSS DS+AI Section • 17 implied HN points • 04 Sep 23

🕹 Technology Data science

The September newsletter focuses on industrial-strength data science and AI.
Committee activities include surveys, alliances, conference participation, and content highlights.
Topics covered range from ethics and generative AI to research developments and practical tips.

Data Science Weekly - Issue 424

Data Science Weekly Newsletter • 19 implied HN points • 06 Jan 22

🕹 Technology Data science

New data science managers have a lot to learn in their first year. They should focus on gaining experience and reflecting on their journey to improve their skills.
Chatbots still struggle with understanding complex human queries. They often provide confusing answers because they lack real-world comprehension.
Real-time machine learning is a growing trend with unique challenges. Companies are talking about their pain points and seeking practical solutions for online predictions and continual learning.

Data Science Weekly - Issue 423

Data Science Weekly Newsletter • 19 implied HN points • 30 Dec 21

🕹 Technology Data science

2021 was a great year for AI research, with many new papers and breakthroughs that need to be understood and followed up on.
Graph machine learning gained a lot of attention, and there are many new trends and advancements worth knowing about.
There are many resources and tools available for learning data science and machine learning, including free courses and beginner-friendly tutorials.

Data Science Weekly - Issue 422

Data Science Weekly Newsletter • 19 implied HN points • 23 Dec 21

🕹 Technology Data science

Games can be made within spreadsheets like Excel or Google Sheets, making learning fun and interactive.
Testing is an important part of a data scientist's job, and understanding how to do it can help improve analysis work.
Understanding language can help in developing smarter machines, opening new paths for machine learning beyond just text processing.

Nerds are better investors

Klement on Investing • 2 implied HN points • 05 Jun 25

💰 Finance Data science

More companies are hiring data scientists to help with investment decisions. This often leads to better returns for those companies.
Hiring data scientists can help firms focus more on specific investments, which improves their insight and portfolio performance.
However, too much reliance on data scientists can make the stock market less efficient, leaving room for traditional analysts to find good investment opportunities.

Multilingual Embeddings, Safer LLMs, and Log-Linear Attention

ppdispatch • 2 implied HN points • 13 Jun 25

🕹 Technology Data science

There's a new multilingual text embedding benchmark called MMTEB that covers over 500 tasks in more than 250 languages. A smaller model surprisingly outperforms much larger ones.
Saffron-1 is a new method designed to make large language models safer and more efficient, especially in resisting attacks.
Harvard released a massive dataset of 242 billion tokens from public domain books, which can help in training language models more effectively.

December RSS AI and Data Science newsletter - anything to contribute?

RSS DS+AI Section • 5 implied HN points • 21 Nov 24

🕹 Technology Data science

The next newsletter for AI and Data Science will be sent out in early December.
Contributors can include announcements about meetups, job openings, or publications.
People should send their contributions directly to the author's email, not by replying to the newsletter.

Data Science Weekly - Issue 421

Data Science Weekly Newsletter • 19 implied HN points • 16 Dec 21

🕹 Technology Data science

Lee Wilkinson made a big impact in the field of interactive visualization. His work helped people better understand and create statistical graphics.
A new journal for machine learning research is starting, aiming for quick and fair reviews. This will help share cutting-edge research in a transparent way.
Feature engineering is still important in machine learning, despite the rise of deep learning. It turns out that creating good features can really boost model performance.

March Newsletter

RSS DS+AI Section • 11 implied HN points • 01 Mar 24

🕹 Technology Data science

The newsletter discussed various updates and activities in the field of data science and AI, including committee activities, advancements in research, and real-world applications.
Ethical considerations, bias, diversity, regulation, and safety in AI and data science were highlighted as hot topics in the newsletter, with examples of AI-related consequences and efforts to improve safety.
The newsletter also featured practical tips, how-to guides, and bigger picture ideas in the field, providing a broad range of information for data science practitioners.

Data Science Weekly - Issue 420

Data Science Weekly Newsletter • 19 implied HN points • 09 Dec 21

🕹 Technology Data science

D3 is a powerful tool for data visualization that has lasted over a decade. Its success is attributed to its flexibility and the community support it receives.
Building AI models like open-source software can make these models better and more collaborative. This means involving a wider community in their development.
Automated decision-making systems can still reflect human biases, which shows that technology doesn't always solve fairness issues.

How AI generates images, visually explained 🎨

Year 2049 • 4 implied HN points • 20 Jan 25

🕹 Technology Data science

AI creates images using a process called diffusion. This means it starts with random noise and turns it into a clear image step by step.
Understanding how AI generates images helps demystify some of the technology behind AI and art. It's cool to see how computers can make creative expressions!
Learning about AI can open up more conversations about its impact on our everyday lives and the future of creativity. It's important to think about both the benefits and challenges.

Data Science Weekly - Issue 419

Data Science Weekly Newsletter • 19 implied HN points • 02 Dec 21

🕹 Technology Data science

FluxML is teaming up with NumFOCUS to enhance open science and improve machine learning tools. This partnership will support new applications in areas like scientific machine learning and differentiable programming.
There’s a fun 30-day challenge involving mapping that highlights the importance of community in data science. It celebrates collaboration and innovation in creating visual representations of data.
AI is making strides in pure mathematics by helping uncover new patterns and insights. This collaboration between AI and mathematicians could lead to significant advancements in understanding complex mathematical concepts.

Data Science Weekly - Issue 418

Data Science Weekly Newsletter • 19 implied HN points • 25 Nov 21

🕹 Technology Data science

Understanding data strategy is crucial for companies. Many invest in data, but few create a data-driven culture.
Deep learning can help with smart, autonomous systems, but caution is needed in safety-critical applications.
Tools like Retool make it easier for teams to build applications on their data without needing extensive coding skills.

Data Science Weekly - Issue 417

Data Science Weekly Newsletter • 19 implied HN points • 18 Nov 21

🕹 Technology Data science

Brains act like prediction machines, which helps them save energy. They do this by predicting what they will perceive in the world around them.
AI is being used to help scientists study chimpanzee behavior in the wild. It can find important clips in hours of footage much faster than humans.
Different approaches to AI governance exist between the EU and the US. This may affect how they collaborate on AI in the future.

Data Science Weekly - Issue 416

Data Science Weekly Newsletter • 19 implied HN points • 11 Nov 21

🕹 Technology Data science

Mature machine learning systems can be tough to improve. Even with cutting-edge technology, you might find that new models don't perform better than old ones.
Data drift and outlier detection are important for monitoring ML models. They help identify issues when you lack ground truth labels to compare against.
Language models score how 'human' a sentence sounds. To train these models, you can analyze and convert language into probabilities.
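
One simple way to picture the last takeaway, turning language into probabilities and using them to score sentences, is a bigram model. The sketch below assumes add-one smoothing and whitespace tokens; it illustrates the idea and is far simpler than the neural language models in use today.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Count contexts and bigrams, with <s> and </s> marking sentence boundaries."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"</s>"}
    for sentence in corpus:
        words = sentence.lower().split()
        vocab.update(words)
        tokens = ["<s>"] + words + ["</s>"]
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def log_prob(sentence, model):
    """Log-probability of a sentence under the bigram model with add-one smoothing."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total


model = train_bigram(["the cat sat on the mat", "the dog sat on the rug"])
natural = log_prob("the cat sat on the rug", model)
scrambled = log_prob("rug the on sat cat the", model)
print(natural > scrambled)  # the natural word order scores higher
```

A higher log-probability means the word sequence looks more like the training text, which is the sense in which such a model scores how "human" a sentence sounds.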

February Newsletter

RSS DS+AI Section • 11 implied HN points • 03 Feb 24

🕹 Technology Data science

Committee activities include expanding accreditation and organizing sessions for the RSS International Conference.
There is an ongoing focus on ethics, regulation, and AI-generated content in data science and AI research.
Research highlights include advancements in large language models, generative AI, and real-world applications.

Data Science Weekly - Issue 415

Data Science Weekly Newsletter • 19 implied HN points • 04 Nov 21

🕹 Technology Data science

Audio signal processing is important for machine learning projects that involve sound. To analyze sound effectively, you need to convert it into spectrograms first.
Algorithmic efficiency in deep learning has improved greatly, requiring much less computing power than before. This means we can train complex neural networks faster and more efficiently now.
Understanding Gaussian processes can be complicated, but looking at them in different ways can help. Each perspective gives new insights and makes the concept easier to grasp.
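
For the spectrogram conversion mentioned in the first takeaway, a minimal sketch is a short-time Fourier transform: slice the signal into overlapping frames, window each frame, and take the magnitude of its Fourier transform. The version below uses a naive DFT from the standard library so that it is self-contained; the frame size, hop, and test tone are arbitrary choices, and real projects would normally use an FFT library instead.

```python
import cmath
import math


def spectrogram(signal, frame_size=64, hop=32):
    """Magnitude spectrogram via a naive short-time Fourier transform."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        # Keep only the non-negative frequency bins 0 .. frame_size // 2.
        mags = []
        for k in range(frame_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Test tone: 80 Hz sine sampled at 640 Hz, so each bin spans 640 / 64 = 10 Hz.
sample_rate = 640
tone_hz = 80
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]
spec = spectrogram(signal)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(peak_bin * sample_rate / 64)  # frequency of the strongest bin in the first frame
```

Each row of `spec` is one time frame and each column one frequency bin, which is the 2-D time-frequency layout that audio models typically take as input.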

Data Science Weekly - Issue 414

Data Science Weekly Newsletter • 19 implied HN points • 28 Oct 21

🕹 Technology Data science

Machine learning can work with messy data. The key is to adapt techniques to handle things like missing values instead of spending all the time cleaning the data.
Visualizations should be clear and focused. Good designs help people understand the information better by removing clutter and emphasizing main points.
There are emerging tools and techniques that can speed up scientific discovery through faster machine learning methods. This helps researchers process data in real time and make new discoveries.

Data Science Weekly - Issue 413

Data Science Weekly Newsletter • 19 implied HN points • 21 Oct 21

🕹 Technology Data science

AI can help create music, but it raises questions about artistic value and originality. It's a mix of excitement and skepticism over how machines understand creativity.
Learning practical tools in computer science, like command-line and version control, is often overlooked in traditional classes. A new course aims to fill this gap by teaching these essential skills.
When developing AI models, it’s important to think about their impact and safety in real-world applications. There are challenges in ensuring these models are ethical and reliable.