The hottest Data Privacy Substack posts right now

And their main takeaways

Three Considerations For Private Open-Source LLM Instances

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 19 implied HN points • 29 Apr 24

🕹 Technology AI Software Open Source Data Privacy Machine Learning

Large Language Models (LLMs) can struggle with performance over time. This problem affects apps that depend on commercial LLM APIs, leading to inconsistencies in how these applications work.
Catastrophic forgetting is a challenge where LLMs forget earlier learned information when they learn new data. This can cause issues when the model is asked to understand broad topics.
Hosting your own open-source LLMs gives your organization more control. You can manage updates, training, and data privacy, making your applications more secure and tailored to your needs.

Apple's AI Strategy in a Nutshell

Enterprise AI Trends • 43 HN points • 11 Jun 24

🕹 Technology AI Trends Data Privacy Voice Assistants

Apple is taking AI seriously and has built its own data center to support its AI projects. This means they have more control and can create better AI experiences for users.
Apple's Siri is expected to become more useful with new features that allow it to perform tasks hands-free, which could lead to a significant increase in AI usage among everyday people.
Apps may struggle to get noticed as Siri might execute tasks without users needing to open them. This could limit how users interact with individual applications.

Must Learn AI Security Part 10: Backdoor Attacks Against AI

Rod’s Blog • 79 implied HN points • 08 Sep 23

🕹 Technology AI Security Cybersecurity Machine Learning Data Privacy Threat Detection

A backdoor attack against AI involves maliciously manipulating an artificial intelligence system to compromise its decision-making process by embedding hidden triggers.
Different types of backdoor attacks include Trojan attacks, clean-label attacks, poisoning attacks, model inversion attacks, and membership inference attacks, each posing unique challenges for AI security.
Backdoor attacks against AI can lead to compromised security, misleading outputs, loss of trust, privacy breaches, legal consequences, and financial losses. This highlights the importance of securing AI systems with strategies like vetting training data, robust architecture, and continuous monitoring.

Tracking/Measurement/Collection/Creation - what was the question again?

timo's substack • 78 implied HN points • 26 Mar 23

🕹 Technology Data Engineering Analytics Data Privacy Data Collection

Finding a niche involves identifying what you enjoy and what is consistently needed in your projects.
Tracking data is easily understood, but may have a negative reputation due to its association with web tracking practices.
Measurement is a broader term than tracking, and data collection is often overlooked in the data engineering process.

How much would you pay not to use Twitter? Or vice-versa?

Social Warming by Charles Arthur • 78 implied HN points • 19 May 23

🕹 Technology Social media AI Apps Subscription Models Data Privacy

Consider how much you would pay or what special features you would require to access social networks like Facebook, Twitter, TikTok, Instagram, or Snapchat.
Offering the right features for paid subscriptions is crucial for social networks to succeed, as seen in the example of Twitter Blue.
Understanding what users are willing to pay for on social networks is important, especially as the industry shifts towards freemium models.

Deconstructing the National Cybersecurity Strategy

Deploy Securely • 78 implied HN points • 03 Mar 23

🕹 Technology Cybersecurity Regulation Government Policy Data Privacy Business strategy

The National Cybersecurity Strategy emphasizes the need for businesses to adapt their cybersecurity strategies accordingly.
The strategy addresses the importance of defending critical infrastructure and the need to streamline cybersecurity regulations.
Business leaders should be aware of potential regulatory changes impacting software security and consider the implications of a national cyber insurance backstop.

Artificial Intelligence Breaches the Fundamental Human Right of Choice

Theology • 11 implied HN points • 10 Feb 25

🕹 Technology AI Ethics Digital Rights Data Privacy Human Rights

Big Tech is forcing AI into our lives without giving us a choice. Instead of letting people decide if they want to use AI, companies are making it hard to opt out.
The right to choose whether we use AI is a fundamental human right. People should have clear options and be informed about how AI affects their choices.
Society needs to push for laws that protect our rights related to AI. Just like privacy laws protect our data, we need rules to keep AI as a choice, not something that's forced on us.

Tell GPT It Can Scrape My -

Permit.io’s Substack • 3 HN points • 09 Aug 24

🕹 Technology AI Ethics Data Privacy Web Development Software Tools Content creation

Many creators are worried about how AIs use their work without permission. This can lead to sharing sensitive data and violating privacy laws.
It's important to identify and rank who is accessing application data, including distinguishing between human users and automated bots.
Users should have control over their own data. They need easy ways to set permissions for who can access their content and under what conditions.

The Deere becomes the Hunter

Easy Observations • 39 implied HN points • 24 Jan 24

💼 Business Data Privacy Market Trends

The agriculture industry is slow to adopt new technology and most farmers do not have access to high-tech features.
John Deere dominates the precision agriculture hardware space and aims to control the flow of agriculture data.
Deere's strategy involves integrating their technology with competitors' machinery through APIs to establish themselves as the central player in Ag Data.

Confidential AI: The Dog That Didn't Bark In The Night

State of the Future • 29 implied HN points • 05 Nov 24

🕹 Technology AI Data Privacy Machine Learning Crypto Infrastructure

We need to prioritize data privacy as AI gets more personal. New technologies could help us protect our information while still allowing AI to learn.
Building fair and unbiased AI models is crucial, as biased models can worsen social inequalities. We have tools to help create better AI that considers everyone fairly.
There's a big opportunity to use decentralized systems for AI training and inference. This could make AI more accessible and less dependent on a few large companies.

A Few Thoughts about O1

AI Research & Strategy • 2 HN points • 12 Sep 24

🕹 Technology Artificial Intelligence Machine Learning Software Development Internet Data Privacy

The new O1 models from OpenAI show impressive results, but they can't be fairly compared to earlier models because they use a different reasoning process.
OpenAI's O1 models are not meant to replace older models entirely and require a system to decide when to use them, which could complicate things.
OpenAI has a controversial pricing strategy, where users might pay for features they can't fully see or understand, raising concerns about transparency.

Maine Micdrop: Auto Right to Repair Wins 84% Support At Polls

Fight to Repair • 59 implied HN points • 10 Nov 23

🕹 Technology Repairability Ecosystem Sustainability Policy Data Privacy

Maine voters strongly support the right to repair automotive vehicles, mirroring efforts in other states. Voting yes on Question 4 allows car owners to choose where they get their vehicles repaired.
Denver's Waste No More initiative promotes deconstruction over demolition to recycle and reuse construction materials, reducing landfill waste and lowering carbon footprint. Transitioning to deconstruction on a large scale faces challenges.
Recognizing the environmental impact of construction waste, Denver residents passed the Waste No More ballot initiative. The ordinance requires the separation and recycling of several materials in construction and demolition activities.

How to Fake Decryption

Nonsense on Stilts • 1 HN point • 04 Sep 24

🕹 Technology Cryptography Cybersecurity Information Theory Computer Science Data Privacy

You can create a fake key and a fake message to trick someone into thinking they decrypted a message. This lets you mislead anyone watching your communication.
It's important to plan what the fake message will be before sending the real one, so both parties know what to expect if asked.
This technique could be used for serious purposes, like hiding important communications, or just for fun in games and stories.
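
The trick described above can be sketched with a one-time pad (XOR) cipher. That choice is an assumption, since the summary does not say which cipher the post uses. Given a ciphertext, a fake key is just the ciphertext XORed with the decoy message you want an observer to see. The names `xor_bytes`, `real_msg`, and `decoy_msg` are illustrative.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Both messages are agreed in advance and must have the same length.
real_msg = b"meet at the dock"
decoy_msg = b"buy milk & bread"

# Normal one-time-pad encryption with a random key.
real_key = secrets.token_bytes(len(real_msg))
ciphertext = xor_bytes(real_msg, real_key)

# The fake key is derived so that it "decrypts" the same ciphertext
# to the decoy message instead of the real one.
fake_key = xor_bytes(ciphertext, decoy_msg)

print(xor_bytes(ciphertext, real_key))  # recovers real_msg
print(xor_bytes(ciphertext, fake_key))  # recovers decoy_msg
```

This only works because a one-time pad key is as long as the message, so any plaintext of that length is consistent with the ciphertext.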

On Techno-pragmatism (part 2)

ailogblog • 39 implied HN points • 07 Jan 24

🕹 Technology AI Ethics Futurism Social Impact Data Privacy

Engineers tend to be empiricists at work but lean towards idealism in considering the social value of their work, showing a need for a balance between pragmatism and idealism in their mindset.
Probabilistic thinking is valuable for navigating uncertainties about the future, allowing for updating beliefs based on new information like in poker or medical diagnosis.
Pragmatism offers a mediating force that combines pluralism and religiosity into a faith in democratic action, providing a balanced approach in a polarized world.

Why TikTok ‘Knows You’: The Data Trick That Makes It Tick

The Daily Bud • 12 implied HN points • 25 Jan 25

🕹 Technology Social media Algorithms Data Privacy User Experience Machine Learning

TikTok's algorithm is really good at guessing what you want to watch next. It keeps improving by watching how you interact with videos.
Unlike other apps, TikTok avoids mixing user data, which helps keep recommendations super personal. This means you get content that's more tailored just for you.
The way TikTok designs its data storage prevents recommendations from getting mixed up. This leads to a cleaner and more enjoyable experience while using the app.

The Need for Speed | 2025 Engineering Benchmarks

Dev Interrupted • 14 implied HN points • 21 Jan 25

🕹 Technology Software Development Engineering Productivity Data Privacy Artificial Intelligence

Smaller pull requests can increase both speed and quality of software development. This helps teams work faster without compromising standards.
Longer cycle times often lead to more errors and project failures. It's essential to keep cycle times short to maintain software quality.
Investing in developer experience (DevEx) is important for a team's productivity. If you don't invest enough, unexpected work and issues can slow down progress.

Yale University vows to 'geolocate' most EJMR users [PART 2]

Karlstack • 197 implied HN points • 18 Jul 23

🕹 Technology Cybersecurity Internet Data Privacy Academic Research Social media

Yale University used a complex procedure to obtain IP addresses of EJMR users.
There are concerns about privacy and legal implications raised by the leaked information.
The leaked slides and paper are causing a stir, especially among academia and legal professionals.

Why I want an AI File Explorer

Cosmos • 39 implied HN points • 31 Dec 23

🕹 Technology AI Data Privacy File Management

An AI file explorer can analyze, tag, search, and organize files based on their contents, freeing users from manual tagging.
Data stored on cloud services may pose privacy and accessibility challenges for using AI on personal files.
Next-generation file explorers, like Cosmos, offer privacy-focused AI solutions, emphasizing user control over data and experimenting with Small Language Models.

Must Learn AI Security Compendium 10: Challenges of Enhancing AI Language Models with External Knowledge

Rod’s Blog • 59 implied HN points • 12 Oct 23

🕹 Technology AI Security Machine Learning Software Data Privacy

Retrieval-Augmented Generation (RAG) enhances AI language models by combining them with external knowledge sources, improving the quality and accuracy of generated responses.
RAG offers benefits such as access to current information, increased contextual understanding, and reduced risk of incorrect data, but it also comes with challenges like data integration and semantic relevance.
The future of RAG includes developments like fine-grained relevance ranking, domain-specific knowledge bases, real-time updates, and ethical considerations to ensure responsible use.
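
The retrieval step behind RAG can be illustrated with a toy example. This is a minimal sketch, not the approach from the post: it swaps in bag-of-words cosine similarity for a real embedding model, and the documents, function names, and prompt layout are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(query: str, docs: list[str], k: int = 1) -> str:
    """Prepend the retrieved context to the question (hypothetical layout)."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


docs = [
    "The EU requires USB-C charging ports on new phones.",
    "Llama-2 can be hosted on several cloud providers.",
    "Browser fingerprinting identifies devices without cookies.",
]
query = "Where can Llama-2 be hosted?"
print(build_prompt(query, docs))
```

The prompt would then be passed to a language model. The challenges the summary mentions, such as semantic relevance, show up in the ranking step, where word overlap is a crude stand-in for meaning.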

It Started with a Cable

Technically Optimistic • 59 implied HN points • 29 Sep 23

🕹 Technology Tech news AI Regulation Data Privacy Digital innovation Tech Policy

Technological advancements deeply influence society.
Engaging more people in conversations about technology is crucial.
Regulations, like the EU's push for standard charging ports, can have significant global impacts.

🕵️🗺️ Where do I deploy Llama-2? 🦙🦙

LLMs for Engineers • 59 implied HN points • 22 Aug 23

🕹 Technology Cloud Computing Artificial Intelligence Machine Learning Data Privacy Cost Analysis

There are many options for hosting Llama-2, including big names like AWS, GCP, and Azure, as well as newer providers like Lambda Labs and CoreWeave. Each has its own pricing and GPU options.
Understanding how much you plan to use Llama-2 is important. This helps you decide whether to use a cloud service provider or a function-based option like Replicate.
Cost-effectiveness varies with different providers. For low usage, function providers can be cheaper, but for higher usage, CSPs might save you money in the long run.
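
The usage trade-off above comes down to a break-even calculation. The sketch below assumes a flat hourly rate for a dedicated GPU instance and a flat per-request price for a function-style provider. Both prices are made up for illustration and are not quotes from any provider named here.

```python
HOURS_PER_MONTH = 730.0  # average month: 8760 hours / 12


def dedicated_monthly_cost(hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of keeping one GPU instance running all month."""
    return hourly_rate * hours


def per_request_monthly_cost(requests: int, price_per_request: float) -> float:
    """Cost of a pay-per-call provider for a given monthly request volume."""
    return requests * price_per_request


def breakeven_requests(hourly_rate: float, price_per_request: float,
                       hours: float = HOURS_PER_MONTH) -> float:
    """Monthly request volume at which both options cost the same."""
    return dedicated_monthly_cost(hourly_rate, hours) / price_per_request


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
GPU_HOURLY_RATE = 1.00      # dollars per hour
PRICE_PER_REQUEST = 0.002   # dollars per request

threshold = breakeven_requests(GPU_HOURLY_RATE, PRICE_PER_REQUEST)
print(f"Break-even at about {threshold:,.0f} requests per month")
```

With these made-up numbers the break-even point is about 365,000 requests per month: below that the per-request option is cheaper, above it the dedicated instance wins. Real comparisons would also need to account for idle time, autoscaling, and throughput per GPU.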

Ways to solve the data user identity & privacy crisis

timo's substack • 58 implied HN points • 08 May 23

🕹 Technology Data Privacy Product Analytics

Consider alternative approaches to using user IDs for data privacy.
Think about using aggregated identifiers like account or team IDs for analytics.
When tracking, prioritize user privacy by disabling cookie tracking and providing user IDs with event calls.

Protecting Patient Privacy with AnonCAT

AI for Healthcare • 58 implied HN points • 26 Apr 23

🏥 Health & Wellness AI Data Privacy

Protecting patient privacy involves removing or masking Protected Health Information (PHI).
AI models should not learn from identifiable data to ensure patient privacy.
Deep learning models like AnonCAT offer an adaptable solution for accurately redacting Electronic Health Records.

EXCLUSIVE: Every economist on Mastodon just had their anonymity compromised by hackers from Yale University 🍿🍿🍿

Karlstack • 183 implied HN points • 22 Jul 23

🕹 Technology Data Privacy Cybersecurity Academic Research

Anonymity of economists on Mastodon compromised by hackers from Yale University
Yale's connection to EJMR and Mastodon raises privacy concerns
Controversy around doxxing and ethical considerations in academic online forums

Secure Machine Learning

Gradient Flow • 199 implied HN points • 16 Jun 22

🕹 Technology Machine Learning Data Privacy Open Source Business Intelligence

Data privacy and security are crucial in machine learning, especially while data is being used; a new open-source library is making Secure Multi-Party Computation more accessible.
Business Intelligence tools help non-programmers analyze data for strategic decisions, with modern tools allowing for advanced analytics and modeling capabilities.
Identifying data startups with real market traction is essential; choosing companies founded post-2006 coincides with the rise of big data technology like Hadoop.
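
The idea behind Secure Multi-Party Computation, mentioned in the first takeaway, can be shown with additive secret sharing. This is a minimal sketch of one standard building block, not the API of the open-source library the post refers to. The modulus and function names are illustrative choices.

```python
import secrets

PRIME = 2**61 - 1  # a Mersenne prime used as the modulus (illustrative choice)


def share(secret: int, n_parties: int) -> list[int]:
    """Split a secret into n random shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (secret - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares: list[int]) -> int:
    """Recover the secret by summing all shares modulo PRIME."""
    return sum(shares) % PRIME


def add_shared(a: list[int], b: list[int]) -> list[int]:
    """Each party adds its own two shares locally; no secret is revealed."""
    return [(x + y) % PRIME for x, y in zip(a, b)]


# Two private values are shared among three parties and summed
# without any single party seeing either input.
salary_a = share(52_000, 3)
salary_b = share(61_000, 3)
total = reconstruct(add_shared(salary_a, salary_b))
print(total)
```

Any subset of fewer than all shares looks uniformly random, which is what protects the data while it is being used.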

Cloud Shared Responsibility Model: Time for an (R)Evolution?

Resilient Cyber • 119 implied HN points • 27 Mar 23

🕹 Technology Cloud Computing Cybersecurity Data Privacy Information Technology

The Shared Responsibility Model (SRM) explains that cloud customers and service providers each have their own security duties. Customers need to understand their roles to prevent most data breaches, which are often due to customer mistakes.
Google Cloud introduced the idea of 'Shared Fate,' encouraging cloud providers to take an active role in helping customers secure their environments. This shift acknowledges that both sides must work together for better security outcomes.
There are growing concerns about the risks of relying on a few major cloud providers. If one suffers a security issue, it can affect everyone, highlighting the need for a community approach to cloud security and trust.

The Trojan Kid

Technically Optimistic • 39 implied HN points • 08 Dec 23

🕹 Technology Online safety Legislation Privacy Social media Data Privacy

The Kids Online Safety Act aims to protect children online, give parents more control, and hold big tech accountable by age-gating, granting parents access to social media content, and imposing a duty of care on platforms.
Legislation targeting teen mental health should consider various factors beyond social media impacts, such as economic insecurity, gun violence, and climate change.
Proposed tech regulations like age verification may have unintended consequences, such as creating barriers for certain communities and endorsing authoritarian parenting styles.

Disservice Providers

Technically Optimistic • 19 implied HN points • 15 Mar 24

🕹 Technology Data Privacy Encryption AI Regulations

Social media platforms like Facebook and Instagram are businesses designed to make money, so they may track your data for profit.
Internet service providers (ISPs) like Comcast and Verizon bundle and sell your personal data, including sensitive information, potentially compromising your privacy.
Protect your data by adjusting your privacy settings, using encryption such as TLS (the successor to SSL), and being aware of how companies handle your information online.

Must Learn AI Security Compendium 16: Shadow AI

Rod’s Blog • 39 implied HN points • 29 Nov 23

🕹 Technology AI Security Generative AI Data Privacy Ethical AI Governance

Shadow AI can expose organizations to risks like data leakage, model poisoning, unethical outcomes, and lack of accountability.
To address shadow AI risks, organizations should establish a clear vision, encourage collaboration, implement robust governance, follow responsible AI principles, and regularly monitor AI systems.
Adopting a responsible and strategic approach to generative AI can help organizations leverage its benefits while minimizing the risks associated with shadow AI.

And AI took that personally

networked • 215 implied HN points • 22 Mar 23

🕹 Technology Artificial Intelligence Machine Learning Open Source Data Privacy Technology Industry

Artificial intelligence is the revolutionary technology that crypto tried and failed to be.
Many of today's popular AI products are effectively loss leaders, not fully-fledged solutions.
AI will often be mindlessly stapled onto legacy formats, creating unoriginal implementations.

Highly Sensitive

Technically Optimistic • 19 implied HN points • 01 Mar 24

🇺🇸 U.S. Politics Data Privacy Legislation AI Technology National Security

President Biden's Executive Order aims to protect Americans' sensitive data from being transferred to 'countries of concern' like China and Russia.
Legislation for data privacy in the US needs to address not just foreign threats but also prevent data collection within the country, like in cases of apps like TikTok.
Comprehensive data privacy laws are crucial, and while the Executive Order is a positive step, there is a need to push for more robust protection measures from legislators.

Graphlan: interpersonal computing for the AI age

Graphlan’s Substack • 19 implied HN points • 01 Mar 24

🕹 Technology AI Relationships Data Privacy

Many people today feel disconnected from their human relationships compared to previous years, with technological advances often making things worse.
Graphlan aims to be a platform that helps people seek out and nurture more meaningful human connections, by providing tools for different types of relationships.
Developers and partners are sought to help shape and build apps on the Graphlan network, which focuses on facilitating genuine, thoughtful interactions between individuals.

T-RAG = RAG + Fine-Tuning + Entity Detection

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 19 implied HN points • 15 Feb 24

🕹 Technology AI LLMs Data Privacy Software Development Machine Learning

T-RAG is a method that combines RAG architecture with fine-tuned language models and an entity detection system for better information retrieval. This approach helps in answering questions more accurately by focusing on relevant context.
Data privacy is crucial when using language models for sensitive documents, so it's better to use open-source models that can be hosted on-premise instead of public APIs. This helps prevent any risk of leaking private information.
The model uses an entities tree to improve context when processing queries, ensuring relevant entity information is included in the responses. This makes the answers more useful and comprehensive for the user.

Must Learn AI Security Part 22: Machine Learning Attacks Against AI

Rod’s Blog • 39 implied HN points • 18 Oct 23

🕹 Technology AI Security Machine Learning Cybersecurity Data Privacy Model Training

Machine Learning attacks against AI exploit vulnerabilities in AI systems to manipulate outcomes or gain unauthorized access.
Common types of Machine Learning attacks include adversarial attacks, data poisoning, model inversion, evasion attacks, model stealing, membership inference attacks, and backdoor attacks.
Mitigating ML attacks involves robust model training, data validation, secure ML pipelines, defense-in-depth, model interpretability, collaboration, and regular audits. It also requires monitoring performance, data, behavior, outputs, logs, network activity, and infrastructure, and setting up alerts.

The latest papers about browser fingerprinting

The Web Scraping Club • 19 implied HN points • 11 Feb 24

🕹 Technology Cybersecurity Data Privacy

Browser fingerprinting is used as an alternative to cookies and raises privacy concerns due to its unique identification capabilities.
Desktop devices are easier to fingerprint uniquely than mobile devices, with Chrome exposing more detailed configuration information.
Innovative approaches like using WebGPU for web fingerprinting pose privacy risks and may require countermeasures to prevent misuse.
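
To see why fingerprinting can stand in for cookies, consider hashing a set of device attributes into a stable identifier. The sketch below is a simplified illustration, not a method from the papers discussed. The attribute names and values are hypothetical, and real fingerprinting scripts collect far more signals in the browser itself.

```python
import hashlib
import json


def fingerprint(attributes: dict[str, str]) -> str:
    """Hash device attributes into a stable hex identifier."""
    # Sorted keys make the result independent of attribute order.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values for illustration only.
desktop = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "2560x1440",
    "timezone": "Europe/Berlin",
    "gpu": "Example GPU 3000",
}
same_device_reordered = dict(reversed(list(desktop.items())))
other_device = {**desktop, "gpu": "Example GPU 4000"}

print(fingerprint(desktop) == fingerprint(same_device_reordered))
print(fingerprint(desktop) == fingerprint(other_device))
```

No state is stored on the device, so clearing cookies does not change the identifier. The more distinctive attributes a device exposes, the more likely its hash is unique.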

Anatomy of a Hack

Am I Stronger Yet? • 47 implied HN points • 25 Jan 24

🕹 Technology Cybersecurity AI Web Development Hackers Data Privacy

Complicated systems are vulnerable to hacks
Real-world hacks involve stringing together various loopholes
Security in complex systems is challenging; more complex systems have more potential security issues

death of reality.

a quest for knowledge • 39 implied HN points • 18 Mar 23

🕹 Technology Deepfakes Data Privacy AI Tools

We are entering an era of deepfakes and manipulated content.
Our digital world is evolving into a fake version of reality.
The fusion of digital and physical worlds is making our online identities more crucial.

COVID-19 digital contact tracing worked - heed the lessons

Digital Epidemiology • 39 implied HN points • 06 Jul 23

🏥 Health & Wellness Digital Health Public Health Data Privacy Pandemic response Technology development

Many European governments were not interested in privacy-preserving digital contact tracing.
Digital contact tracing showed that privacy preservation and fighting a pandemic can go hand in hand.
There is a lack of investment in digital contact tracing technology despite its potential benefits.

Governing by apocalypse

Antimaterie • 39 implied HN points • 31 May 23

🕹 Technology AI Information Control Manipulation Social Impact Data Privacy

The fear of AI wiping out humanity is being used as a scare tactic by elites to gain control of the field.
Governments are worried about losing control as individuals gain access to vast knowledge through AI applications.
The power of AI to extract knowledge from information poses a threat to established narratives and information control by governments and elites.

Brief: How Microsoft AI Protects Your Data Privacy

Rod’s Blog • 19 implied HN points • 07 Feb 24

🕹 Technology AI Data Privacy Microsoft

Microsoft AI is based on the principle of 'your data is your data', emphasizing that you own and control your personal data.
Microsoft AI ensures data privacy by collecting and using data with consent, not selling data to third parties, and implementing strong security measures.
Data privacy is crucial for AI as it builds trust, protects human rights and promotes innovation in the industry.