The hottest Research Methods Substack posts right now

And their main takeaways
Cremieux Recueil • 477 implied HN points • 25 Mar 26
  1. Researchers often use between-person comparisons that aren’t causally informative even when within-person or sibling designs are possible, so their estimates can be biased by unmeasured confounders.
  2. When you run within-family or within-person analyses, many headline associations (for example, claims that more social media use lowers cognition) disappear, suggesting those original results were artifacts of confounding.
  3. The field routinely skips basic robustness checks and measurement-invariance tests; empowering methodologists, providing better tools, and enforcing stricter editorial standards would greatly improve research reliability.
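The confounding argument in the takeaways above can be illustrated with a small simulation. This is a minimal sketch with made-up parameters, not the post's own analysis: the names `family_factor`, `exposure`, and `outcome` are hypothetical, and the exposure is given zero causal effect on the outcome by construction.

```python
import random

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(0)
exposure, outcome = [], []        # pooled individuals (between-person data)
d_exposure, d_outcome = [], []    # sibling differences (within-family data)

for _ in range(2000):             # 2000 hypothetical two-sibling families
    family_factor = rng.gauss(0, 1)  # unmeasured confounder shared by siblings
    sibs = []
    for _ in range(2):
        x = family_factor + rng.gauss(0, 1)  # exposure depends on the family
        y = family_factor + rng.gauss(0, 1)  # outcome depends on the family, not on x
        sibs.append((x, y))
        exposure.append(x)
        outcome.append(y)
    d_exposure.append(sibs[0][0] - sibs[1][0])
    d_outcome.append(sibs[0][1] - sibs[1][1])

between_slope = slope(exposure, outcome)     # confounded by family_factor
within_slope = slope(d_exposure, d_outcome)  # family_factor cancels in the difference
print(f"between-person slope: {between_slope:.2f}")
print(f"within-family slope:  {within_slope:.2f}")
```

Under these assumptions the between-person slope should land near 0.5 and the within-family slope near 0, because differencing siblings removes the shared family factor.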
Experimental History • 21198 implied HN points • 17 Feb 26
  1. Many famous psychology and neuroscience findings are under fresh scrutiny because of shady methods, tiny samples, or failed replications, so canonical stories aren’t as solid as they once seemed.
  2. How researchers measure things matters a lot — using correlation versus absolute error can lead to opposite conclusions about whether people understand how public opinion has changed.
  3. A bunch of curious, practical items matter too: interviews, art and career advice, puzzles and internet myths show the value of digging deeper, and a few vocal individuals often dominate complaint systems and waste resources.
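The point about correlation versus absolute error can be made concrete with a toy comparison. The numbers below are invented for illustration and do not come from the post: `group_a` tracks the direction of change perfectly but is off by a constant 20 points, while `group_b` is close in level but barely tracks the changes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def mean_abs_error(truth, guesses):
    """Average absolute gap between guesses and the true values."""
    return sum(abs(t - g) for t, g in zip(truth, guesses)) / len(truth)

# Hypothetical "true" opinion levels and two sets of guesses about them.
truth = [40, 45, 50, 55, 60]
group_a = [60, 65, 70, 75, 80]   # right trend, wrong level
group_b = [52, 48, 50, 47, 53]   # roughly right level, almost no trend

r_a, mae_a = pearson_r(truth, group_a), mean_abs_error(truth, group_a)
r_b, mae_b = pearson_r(truth, group_b), mean_abs_error(truth, group_b)
print(f"group_a: r={r_a:.2f}, mean abs error={mae_a:.1f}")
print(f"group_b: r={r_b:.2f}, mean abs error={mae_b:.1f}")
```

Judged by correlation, `group_a` looks far more accurate. Judged by absolute error, `group_b` does. Which metric is appropriate depends on whether the question is about tracking change or about knowing the level.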
Knowingless • 1566 implied HN points • 12 Mar 26
  1. Scales are groups of survey items found with factor analysis that let you measure hidden traits efficiently, but they need lots of questions and many respondents to be reliable, and metrics like Cronbach’s alpha can be gamed by redundant items.
  2. Which items you include strongly shapes what factors you find, so a narrow or biased question set will miss whole traits; crowdsourcing a huge swath of questions can reveal unexpected dimensions but doesn’t eliminate sampling or submission bias.
  3. When you open up question-space widely, the biggest stable dimensions that tend to pop out are political left–right, belief/mysticism versus rationality, and a happy-versus-sad emotional axis, with many smaller subfactors depending on how finely you break the data.
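The claim that Cronbach's alpha can be gamed by redundant items can be checked with a short calculation. This sketch uses the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), on invented responses. The data and the idea of padding with exact copies are illustrative assumptions, not the post's survey.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of per-item response lists."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical answers from six respondents to three loosely related items.
item1 = [1, 2, 3, 4, 5, 6]
item2 = [3, 1, 5, 2, 6, 4]
item3 = [4, 2, 1, 6, 3, 5]

alpha_base = cronbach_alpha([item1, item2, item3])
# Pad the scale with two exact copies of item1 (an extreme case of redundancy).
alpha_padded = cronbach_alpha([item1, item1, item1, item2, item3])

print(f"alpha with 3 distinct items:    {alpha_base:.2f}")
print(f"alpha after adding 2 duplicates: {alpha_padded:.2f}")
```

The padded scale measures nothing new, yet its alpha is much higher, which is why a high alpha on its own is weak evidence that a scale is good.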
@adlrocha Weekly Newsletter • 64 implied HN points • 13 Mar 26
  1. A simple edit-evaluate-keep loop lets autonomous agents run short experiments and find real improvements by iterating quickly on a single editable training file and a fast proxy metric like validation bits-per-byte.
  2. Many small agents running on varied hardware can share discoveries via gossip protocols and turn idle or distributed GPUs into a decentralized research swarm that accelerates optimizations collectively.
  3. Picking the right evaluation and reward function is the hard part—designing clean, fast proxies and constraints (research taste) will matter more than raw execution in many fields, especially where feedback is slow or noisy.
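The edit-evaluate-keep loop described above is essentially greedy hill climbing on a proxy metric. The following is a minimal sketch under stated assumptions: `propose_edit` and `evaluate` are hypothetical stand-ins for "edit the training file" and "run a short experiment", and the toy metric is a simple quadratic rather than a real validation bits-per-byte measurement.

```python
import random

def edit_evaluate_keep(config, propose_edit, evaluate, steps, seed=0):
    """Greedy loop: try an edit, score it, keep it only if the score improves."""
    rng = random.Random(seed)
    best, best_score = config, evaluate(config)
    for _ in range(steps):
        candidate = propose_edit(best, rng)
        score = evaluate(candidate)
        if score < best_score:  # lower is better, as with bits-per-byte
            best, best_score = candidate, score
    return best, best_score

def propose_edit(config, rng):
    """Hypothetical edit: nudge one hyperparameter on a copy of the config."""
    edited = dict(config)
    edited["log10_lr"] += rng.gauss(0, 0.3)
    return edited

def evaluate(config):
    """Toy proxy metric with its minimum (1.0) at log10_lr = -3."""
    return 1.0 + (config["log10_lr"] + 3.0) ** 2

start = {"log10_lr": -1.0}
start_score = evaluate(start)
best, best_score = edit_evaluate_keep(start, propose_edit, evaluate, steps=200)
print(f"start score: {start_score:.3f}, best score: {best_score:.3f}")
```

The loop itself is trivial. As the third takeaway notes, the difficult design decision is what `evaluate` measures, since the loop will optimize whatever proxy it is given.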
Subconscious • 1146 implied HN points • 25 Feb 26
  1. Fold context by running separate agent threads on different sources, saving each thread's summary, and then merging those summaries into a synthesized solution — this divergence-then-convergence workflow yields much better results.
  2. Problems need enough variety to be solved. LLMs have huge latent variety that RLHF often narrows, so you can restore useful, surprising behavior by steering models with context windows, tools, and divergent multi-agent exploration.
  3. Save the summaries as compressed artifacts for reuse and run multiple passes (research then development) to both explore and refine ideas, and be willing to give up some control so agents can surface novel, meaningful options.
Knowingless • 4389 implied HN points • 05 Feb 26
  1. A lot of top fetish/kink surveys use small or convenience/targeted samples and often lack full anonymity, which makes them prone to selection bias and limits how much we can trust their conclusions.
  2. Very large internet surveys, even if imperfect, can outperform many academic studies on sample size and breadth and often replicate known psychological patterns, making them valuable for studying relationships even if raw base-rate estimates are shaky.
  3. Structural problems—publication incentives, peer-review politics, restrictive IRBs, and uneven statistical skill—are major reasons the field stays small-scale and cautious instead of improving methods and collecting bigger, better data.
Cremieux Recueil • 295 implied HN points • 13 Mar 26
  1. Researchers often split samples and hunt for subgroups where effects become significant, but reporting subgroup "wins" without testing interactions or accounting for low power produces misleading, likely fluke results.
  2. The functional medicine trial example shows clear red flags: inconsistent numbers, bad or post-hoc preregistration, incorrect power/sample-size math, undisclosed conflicts, non-ITT analyses, and unreported/misused subgroup tests with weak measures.
  3. These practices make findings fragile and hard to replicate, so studies need proper prospective registration, correct power calculations, transparent reporting (including interaction tests), multiple-comparisons control, and shared data to be trustworthy.
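The difference between a subgroup "win" and a proper interaction test can be shown with a few lines of arithmetic. The effect sizes and standard errors below are hypothetical, the subgroups are assumed independent, and a normal approximation is used for the p-values. None of these numbers come from the trial discussed in the post.

```python
from math import erfc, sqrt

def two_sided_p(estimate, se):
    """Two-sided p-value from a normal (z) approximation."""
    z = abs(estimate / se)
    return erfc(z / sqrt(2))

# Hypothetical treatment effects in two subgroups of equal precision.
effect_a, se_a = 0.50, 0.24
effect_b, se_b = 0.20, 0.24

p_a = two_sided_p(effect_a, se_a)   # subgroup A tested on its own
p_b = two_sided_p(effect_b, se_b)   # subgroup B tested on its own

# Interaction test: is the effect in A actually different from the effect in B?
difference = effect_a - effect_b
se_difference = sqrt(se_a**2 + se_b**2)  # valid for independent subgroups
p_interaction = two_sided_p(difference, se_difference)

print(f"subgroup A: p = {p_a:.3f}")
print(f"subgroup B: p = {p_b:.3f}")
print(f"A vs B interaction: p = {p_interaction:.3f}")
```

Here subgroup A crosses the 0.05 threshold and subgroup B does not, but the test of the difference between them is far from significant, so "it works in A but not in B" is not supported.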
Why is this interesting? • 1447 implied HN points • 03 Feb 26
  1. Many major artificial sweeteners were found by accident when people in labs tasted or otherwise noticed unexpected sweetness from spilled or handled chemicals.
  2. Human senses, especially taste, act as extremely sensitive high-throughput detectors and can spot potent effects that controlled screenings often miss.
  3. Accidental discoveries can beat deliberate testing in impact, but safety matters—breakthroughs from exposure to the unknown should never justify reckless lab behavior.
Heterodox STEM • 348 implied HN points • 22 Feb 26
  1. Using 'many scientists believe' as proof is not the same as presenting hard evidence, and for issues like whether the polar jet stream is weakening the clear observational data is limited or inconclusive.
  2. Much climate reasoning depends on open-loop computer models that aren't validated the way engineering models are. Funding and media incentives can push scientists to emphasize more alarming model results.
  3. Political and funding pressures can distort scientific priorities and public messaging, so consensus and authority shouldn't replace testable evidence. Real scientific progress often overturns majority views, so skepticism and empirical testing must stay central.
DYNOMIGHT INTERNET NEWSLETTER • 703 implied HN points • 05 Feb 26
  1. If you measure lifespan heritability in a simulated world with no non‑aging deaths (accidents, murder, overdoses, infectious disease), the apparent heritability rises to roughly 46–57% (about 50%).
  2. Heritability is an observational ratio that depends on societal and environmental factors, so lowering extrinsic mortality naturally increases the fraction of lifespan variation attributed to genetics.
  3. The simulation is a useful exercise and matches historical twin estimates, but its strong assumptions and vague reporting mean the ~50% figure shouldn’t be taken as the true modern heritability; a more cautious read of the results suggests something closer to 35–45% (around 40%).
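The second takeaway, that heritability is a ratio and therefore moves when extrinsic mortality changes, follows directly from its definition. This sketch assumes a simplified additive model with independent variance components. All variance values are invented for illustration and are not the post's or the simulation's figures.

```python
def heritability(var_genetic, var_non_genetic):
    """Share of total variance attributed to genetics under an additive model."""
    return var_genetic / (var_genetic + var_non_genetic)

# Hypothetical variance components of lifespan (arbitrary units).
var_genetic = 30.0          # variation due to genes
var_intrinsic_env = 45.0    # non-genetic variation in aging-related deaths
var_extrinsic = 75.0        # accidents, violence, infections, and similar

h2_observed = heritability(var_genetic, var_intrinsic_env + var_extrinsic)
h2_no_extrinsic = heritability(var_genetic, var_intrinsic_env)

print(f"with extrinsic deaths:    h2 = {h2_observed:.2f}")
print(f"without extrinsic deaths: h2 = {h2_no_extrinsic:.2f}")
```

The genetic variance is identical in both cases. Only the denominator shrinks, which is why removing extrinsic deaths raises the ratio without any change in the underlying genetics.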
ASeq Newsletter • 14 implied HN points • 18 Mar 26
  1. Sarmal is a new company working on DNA sequencing and is pitching a technology called FLASH.
  2. FLASH stands for Fluorescence Activation by Serial Hybridization and is described as involving a polymerase, but the explanation and figure are unclear.
  3. There is a patent for the technology, and deeper details are behind a paywall.
In My Tribe • 303 implied HN points • 07 Feb 26
  1. Personality traits only nudge the odds; the situation and the people around someone usually explain behavior better than fixed “types” do.
  2. Successful builders often show persistence, agency, and resilience, but survivorship bias means sticking with something doesn’t guarantee success for most people.
  3. The path from genes to personality to behavior is messy, so genetic predictors are weak and experiences, relationships, and context matter a lot.
Astral Codex Ten • 11287 implied HN points • 11 Jul 25
  1. The structure of scientific papers can create a misleading impression of how research actually happens. Often, real research involves lots of trial and error, not just a straight path from question to answer.
  2. The amyloid cascade hypothesis, which holds that amyloid plaques in the brain cause Alzheimer's, has dominated research attention, but recent studies suggest it might not be the whole story. That focus has directed research effort and funding toward treatments that may not work.
  3. When reading scientific papers, it's important to think critically and not just accept the conclusions presented. Questions about what is missing or what alternative explanations exist can reveal more about the validity of the research.
Cremieux Recueil • 235 implied HN points • 23 Feb 26
  1. Many reported Flynn and anti-Flynn effects are driven by measurement bias—tests change meaning across cohorts and norms get obsolete—so gains often reflect test-taking sophistication more than real changes in general ability.
  2. Some apparent cohort trends are actually sampling or compositional artifacts, for example later-born children tending to have more advantaged parents, and those apparent gains or losses often disappear in within-family (sibling) comparisons.
  3. Robust conclusions require checking measurement invariance, using within-family designs, and guarding against collinearity and low power; when those methods are applied, large population IQ shifts usually shrink or vanish.
After Babel • 448 implied HN points • 05 Feb 26
  1. A free, research-informed toolkit gives schools ready-made surveys and measures to track how phone policies affect students, teachers, administrators, and parents.
  2. It works for both single-school evaluations and large, rigorous studies—Qualtrics formats and optional collaboration with the Stanford Social Media Lab support longitudinal tracking and advanced analysis.
  3. The toolkit adds practical analysis help (a manual scoring guide, a customizable survey builder, and a coming Data Dashboard), but it doesn’t by itself establish definitive causality without stronger study designs.
The Infinitesimal • 719 implied HN points • 09 Aug 24
  1. Twin heritability models can produce different estimates of how much traits are influenced by genetics versus environment. This can lead to confusion about what is truly inherited and what is shaped by upbringing.
  2. Cultural factors along with genetic factors play a significant role in shaping traits. Sometimes, what seems genetic can actually be environmental influences like parenting styles, which complicate our understanding of inheritance.
  3. Recent studies suggest that assumptions made in traditional twin studies might not be entirely accurate. By including more family relationships and considering cultural impacts, researchers can get a clearer picture of what really contributes to traits.
Experimental History • 17893 implied HN points • 04 Feb 25
  1. There are two types of problems: weak-link problems, where the overall quality depends on the weakest part, and strong-link problems, where the best part matters most. Understanding this helps us solve issues better.
  2. Science is often treated like a weak-link problem, focusing on stopping bad research rather than promoting great ideas. This approach can hold back progress in scientific discovery.
  3. To improve science, we should shift our mindset to supporting strong ideas and innovative research. This means caring less about keeping out the bad and more about encouraging the good.
Cremieux Recueil • 277 implied HN points • 13 Feb 26
  1. Changing test scoring to reward calibrated confidence and risk behavior instead of just right-or-wrong answers can make women appear smarter even though it measures a different thing.
  2. Including metacognitive calibration, confidence, and risk preference in an intelligence score mixes non-intelligence traits into the measure and can break the usual positive correlations across cognitive tests, producing misleading factor patterns.
  3. The correct way to compare sexes on intelligence is to use a large, diverse test battery, score accuracy normally, and compare the general intelligence factor; redefining intelligence without strong justification is not acceptable.
Asimov Press • 245 implied HN points • 12 Feb 26
  1. A simple motorized device called the vortex mixer uses a rubber cup and tight orbital motion to create a vortex that quickly mixes liquids in tubes and small vessels.
  2. The inventors combined technical skill and business savvy to prototype, patent, and commercialize the mixer, then improved it with features like touch activation, speed control, and multi-tube heads.
  3. Vortex mixers made mixing faster, cleaner, and less prone to contamination, becoming a ubiquitous and essential tool in modern biology labs.
Res Obscura • 15240 implied HN points • 22 Jan 25
  1. AI models are getting really good at history, especially in specific areas. They can help with tasks like translating old texts and offering historical context.
  2. While some people worry that AI tools lead to cheating in education, they can also enhance research efficiency. They help researchers to gather information and insights quickly.
  3. Despite AI's advancements, human creativity and understanding are still irreplaceable. There's a recognition that the unique human experience and thoughts are valuable and cannot be fully replicated by AI.
Astral Codex Ten • 6194 implied HN points • 03 Jul 25
  1. Genetic and environmental interactions matter a lot in understanding traits. Some traits are influenced by how genes work together with the environment, which makes it tricky to measure their heritability accurately.
  2. Using genetic scores from one population in another can lead to incorrect conclusions about intelligence differences. This happens because different groups might have different gene structures affecting traits, leading to wrong assumptions about genetic causes of observed differences.
  3. Research methods like twin studies and adoption studies can show different heritability estimates. It's important to carefully consider the assumptions behind these studies, as biases can impact results significantly.
In My Tribe • 288 implied HN points • 12 Jan 26
  1. Many psychological findings fail to replicate, which suggests the field needs stronger methods and that folk intuitions can make it hard to tell scientific results from guesswork.
  2. Because many genes affect many traits and behavior emerges from complex gene–environment interactions, predicting disorders or specific traits from genetics is very difficult, and turning continuous traits into binary diagnoses makes the statistics less reliable.
  3. Evolutionary ideas often explain common tendencies in politics and behavior, but they are not strict rules—social institutions, personality differences, and policy choices can amplify, reduce, or reverse those tendencies.
Experimental History • 11606 implied HN points • 23 Oct 24
  1. Democrats and Republicans misunderstand each other, but both sides can convincingly mimic each other's views. This shows they actually have a better grasp of each other's beliefs than they think.
  2. In a study, both parties struggled to differentiate between real and fake statements from their opponents, suggesting they might not truly know the depth of each other's perspectives.
  3. The findings imply that political disagreements might be real differences, not just simple misunderstandings, challenging the idea that better communication could solve everything.
Cremieux Recueil • 434 implied HN points • 27 Dec 25
  1. Make sure your criticism is correct: check the data, run the needed analyses, and only accuse or declare problems when you can justify them.
  2. Focus on meaningful, relevant issues that actually change conclusions — don’t list hypotheticals; quantify or demonstrate how a confound or error would affect the results.
  3. Be generous and contextual: assume good faith, ask for clarification or contact authors privately when fixable, and build enough domain knowledge to notice real problems instead of relying on rote one‑liners.
Unsafe Science • 152 implied HN points • 26 Jan 26
  1. AI tools can do careful, time-consuming critical reviews in minutes instead of days, making it possible to audit many papers quickly.
  2. Much microaggression research relies on self-reports, treats perceptions as objective facts, overstates causation from correlational data, and often uses circular logic that makes the claims hard to falsify.
  3. Scaling AI-driven critique could expose biased or low-quality scholarship and improve accountability, but its findings need human verification and there are real risks when criticism is dismissed as racism to avoid scrutiny.
Lever • 19 implied HN points • 16 Oct 24
  1. Bruce Wittmann's journey in science started from pre-med and led him to research at notable institutes like Caltech.
  2. He worked on machine learning to improve protein engineering, building tools that can help many people in the field.
  3. His collaboration with renowned scientists and contributions to published research highlight the exciting potential in protein design and computational biology.
AI Snake Oil • 2298 implied HN points • 16 Jul 25
  1. AI might actually slow down scientific progress instead of speeding it up. Even with more papers being published, true advancements in science could be stuck or even going backward.
  2. The more papers people publish, the harder it is for truly groundbreaking ideas to get noticed. This makes it tough for new and unique research to break through amidst all the noise.
  3. Scientists need to focus on understanding rather than just finding quick solutions. If AI is used to bypass understanding, we risk getting stuck with incorrect theories for longer.
Unsafe Science • 119 implied HN points • 29 Jan 26
  1. AI can be used to spot propaganda disguised as academic scholarship, doing in minutes what can take humans days and making large-scale checks possible.
  2. Some academic work is ideologically driven and can selectively cite or spin evidence, so claims (like widespread hiring bias) sometimes don’t match the actual data.
  3. Exposing propaganda often triggers hostile reactions from its defenders, which can signal the exposure is hitting a nerve, and automating the work with AI would make such critique faster and broader.
Living Fossils • 2 implied HN points • 11 Mar 26
  1. Many famous effects in psychology, like social priming and strong birth-order personality claims, don’t replicate well and are often statistical flukes or very weak.
  2. Boosting self-esteem doesn’t reliably cause better achievement; usually success and competence lead to higher self-esteem instead.
  3. Popular explanations like “emotional intelligence” or simple chemical‑imbalance models of mental illness are vague or unsupported, with poor measurement and limited predictive power, so we still don’t really know the causes of most mental disorders.
Never Met a Science • 188 implied HN points • 15 Jan 26
  1. AI is now powerful enough to reshape how research is produced, and academic institutions must adapt quickly or be overwhelmed by a flood of AI-assisted work.
  2. AI offers clear benefits like automated replication and more frequent updating of knowledge, but we need institutional safeguards about ownership, verification, and corporate control of the tools.
  3. The role of scholars should shift toward curating and filtering knowledge and maintaining deep expertise, supported by metascientific reforms that preserve epistemic authority and make inductive approaches credible.
Unsafe Science • 79 implied HN points • 02 Feb 26
  1. Many microaggression studies rely on correlational, nonexperimental data but still claim causal relationships between racism, microaggressions, and outcomes.
  2. Concluding that microaggressions cause negative health or mental-health impacts from simple correlations is not justified without stronger causal evidence.
  3. Peer review has often failed to catch these methodological flaws, allowing unsupported causal claims to persist in the literature.
Pekingnology • 139 implied HN points • 16 Jan 26
  1. China studies is drifting away from language skills, fieldwork, and primary sources, so much research is disconnected from the lived experience and context inside China.
  2. Many younger researchers approach China with vigilance and a competition mindset instead of curiosity, which biases questions and pushes attention-grabbing policy claims over balanced understanding.
  3. There is an unhealthy methodological imbalance—heavy reliance on quantitative models, overly narrow specialties, or vague grand-policy talk without historical and cultural grounding—leading to shallow analysis that can worsen mutual distrust.
Wyclif's Dust • 1609 implied HN points • 05 Jun 25
  1. Scientism can happen when researchers make general claims about science without considering the limits of their studies. It's important for scientists to recognize when their findings may not apply broadly.
  2. Social scientists often use big concepts that sound scientific, but they sometimes fail to acknowledge the unique context of their studies. This can lead to misleading conclusions about complex issues.
  3. The way some researchers present their findings may resemble 'cargo cult science,' where they follow scientific methods superficially but miss the deeper understanding needed for true insights. It's essential to connect the rigor of research with the actual realities of the world.
Elizabeth Laraki • 419 implied HN points • 28 May 24
  1. Kerry Rodden, a UX researcher, helped YouTube understand how users navigated the site. By deeply analyzing user data, they found out what people really wanted from YouTube.
  2. One big surprise was that most YouTube sessions didn't start on the homepage. Instead, many users went directly to watch videos they found elsewhere on the internet.
  3. Kerry created clear visualizations of user data that showed how people moved through YouTube. This helped the company improve its homepage and focus on personalizing content for users.
Heterodox STEM • 298 implied HN points • 30 Nov 25
  1. A major critique is that some immigration research adds little original empirical or theoretical insight and omits important peer‑reviewed studies that directly bear on its claims.
  2. The common measure of "generalized social trust" used to link trust and economic growth is argued to be flawed — problems include questionable survey validity, weak prediction of real trusting behavior, sample bias, omitted variables, and a lack of incorporation into formal growth models; when addressed, the purported trust–growth relationship can vanish.
  3. Scholarly disputes are criticized for relying on vague accusations, deleted public comments, and a failure to make specific, formal challenges to peers or journal editors, highlighting a need for clearer, evidence‑based engagement.
Brad DeLong's Grasping Reality • 199 implied HN points • 13 Dec 25
  1. Universities must earn public trust by being institutionally trustworthy: fix internal monocultures and focus teaching on real, demonstrable skills that give students access to useful knowledge.
  2. The true ‘super‑intelligence’ is the five‑millennia corpus of human ideas, and modern text‑processing systems are valuable mainly as translators or front ends to curated knowledge rather than infallible oracles.
  3. Education should train people to connect to, interpret, and extend the collective human mind by teaching durable methods, literacies, Popperian testability, and epistemic humility while updating practical skills for new media.
Unsafe Science • 97 implied HN points • 10 Jan 26
  1. Claims about widespread unconscious bias and pervasive anti‑female hiring discrimination are often overstated; measures like the IAT tap associations in memory rather than proven unconscious prejudice and do not reliably predict discriminatory behavior.
  2. Many DEI and anti‑bias trainings lack solid evidence that they change real‑world behavior and can have unintended costs or even provoke reverse bias, so interventions should be rigorously evaluated for both benefits and harms.
  3. The best practical approach is to focus like a laser on merit by using clear, job‑relevant criteria and individualized evidence, and to improve credibility through adversarial collaboration and honest communication about uncertainty.
The Good Science Project • 167 implied HN points • 23 Dec 25
  1. Metascience needs a clear micro vs. macro distinction: micro focuses on individual scientists’ beliefs, trust, and behaviors, while macro covers institutions, funding, and governance.
  2. Reforms often fail when they operate at only one level because individuals respond to incentives in predictable ways, producing unintended outcomes like gaming rules or self‑censoring risky work.
  3. Fixing science requires a full‑stack approach that designs policies to change both institutional incentives and the everyday experience of researchers, accounting for the feedback loops between the two.