Methodology & Transparency

How rounds. scores, classifies, and summarizes medical research for your daily digest.

We believe physicians deserve full transparency on how AI tools work. This page documents every step of our pipeline — from article discovery to the digest you read each morning.

How We Score Articles

Every article in your feed receives a relevance score from 0 to 100. This score is computed from four weighted factors, calibrated through physician feedback:

Semantic Similarity

45%

How closely an article matches your specialty, sub-interests, and reading history — measured via embedding similarity between paper abstracts and your personalization profile.

Evidence Quality

30%

Combines evidence level (L1-L5), study design, bias flag count, sample size, and a composite quality score (0-1). Higher-quality evidence is weighted more heavily.

Recency

15%

More recent publications receive a higher score. A time-decay function ensures your digest surfaces the latest findings without ignoring important older work.

Novelty

10%

Articles covering topics you haven't seen recently score higher. This prevents your feed from becoming an echo chamber and surfaces emerging research areas.
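As a rough sketch of how these four weights combine into a single 0-100 score (the half-life and helper names below are illustrative assumptions, not our production values):

```python
import math

# The four factor weights described above (45% / 30% / 15% / 10%).
WEIGHTS = {"similarity": 0.45, "evidence": 0.30, "recency": 0.15, "novelty": 0.10}

def recency_score(days_old: float, half_life_days: float = 30.0) -> float:
    """Exponential time-decay: 1.0 for today's papers, 0.5 after one half-life.
    The 30-day half-life is an illustrative assumption."""
    return 0.5 ** (days_old / half_life_days)

def relevance_score(similarity: float, evidence: float,
                    days_old: float, novelty: float) -> float:
    """Combine the four factors (each in [0, 1]) into a 0-100 relevance score."""
    factors = {
        "similarity": similarity,
        "evidence": evidence,
        "recency": recency_score(days_old),
        "novelty": novelty,
    }
    return 100 * sum(WEIGHTS[name] * value for name, value in factors.items())
```

Because the weights sum to 1 and every factor is bounded by [0, 1], the combined score stays within 0-100 by construction.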

Evidence Quality Classification

Each article is automatically classified according to a 5-level evidence hierarchy, adapted from the Oxford Centre for Evidence-Based Medicine framework:

Level 1

RCTs & Meta-Analyses

Systematic reviews of RCTs and individual randomized controlled trials. The highest standard of clinical evidence.

Level 2

Prospective Studies

Systematic reviews of cohort studies and individual prospective cohort studies with good methodology.

Level 3

Retrospective Studies

Case-control studies and retrospective analyses. Useful for hypothesis generation but prone to selection bias.

Level 4

Case Series

Case series and case reports without controls. Limited generalizability but valuable for rare conditions.

Level 5

Expert Opinion

Expert opinion, mechanism-based reasoning, and bench research without direct clinical validation.
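A minimal sketch of how study-design metadata could map onto this hierarchy. The keyword table below is illustrative only; the production classifier presumably uses richer metadata than a keyword match.

```python
# Illustrative mapping from study-design keywords to the 5-level hierarchy.
# More specific designs are listed first so they match before broader terms.
EVIDENCE_LEVELS = {
    "meta-analysis": 1,
    "randomized controlled trial": 1,
    "systematic review of cohort studies": 2,
    "prospective cohort": 2,
    "case-control": 3,
    "retrospective cohort": 3,
    "case series": 4,
    "case report": 4,
    "expert opinion": 5,
    "in vitro": 5,
}

def classify_evidence(study_design: str) -> int:
    """Return evidence level 1-5; default to 5 when the design is unrecognized."""
    design = study_design.lower()
    for keyword, level in EVIDENCE_LEVELS.items():
        if keyword in design:
            return level
    return 5  # conservative default: treat unknown designs as lowest level
```

Defaulting unknown designs to Level 5 is the conservative choice: an article is never credited with stronger evidence than its metadata supports.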

Bias Detection

Our pipeline checks for 18 canonical bias flags, including methodological limitations, conflicts of interest, and statistical concerns. Articles with three or more flags and a low evidence level are flagged as low quality.

  • Small sample size
  • No blinding
  • No control group
  • Self-reported outcomes
  • Single center
  • Short follow-up
  • Industry funded
  • High attrition
  • No intention-to-treat
  …plus 9 more
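The low-quality rule can be sketched as below, assuming evidence levels 4-5 count as "low" (that cutoff is our illustration, not a confirmed threshold):

```python
def is_low_quality(bias_flags: list[str], evidence_level: int) -> bool:
    """Flag an article as low quality when it carries 3+ bias flags
    AND sits at a low evidence level (assumed here to mean levels 4-5)."""
    return len(bias_flags) >= 3 and evidence_level >= 4
```

Note that neither condition alone is sufficient: a well-designed RCT with several flags, or a flawless case report, passes through unflagged.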

AI Summarization

Summaries are generated by GPT-4o with a structured medical prompt that extracts:

Core message

What does this paper conclude?

Strengths

Methodological strengths of the study

Weaknesses

Limitations and caveats to consider

Impact on practice

How this could affect your clinical work

Action items

Concrete next steps if findings are validated

Relevance score

0-100 personalized relevance to your niche
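The extracted fields might be represented as a structured record like the following (field names mirror the list above and are illustrative, not the actual prompt schema):

```python
from dataclasses import dataclass, field

@dataclass
class ArticleSummary:
    """Hypothetical container for the structured summarization output."""
    core_message: str                        # What does this paper conclude?
    strengths: list[str] = field(default_factory=list)    # Methodological strengths
    weaknesses: list[str] = field(default_factory=list)   # Limitations and caveats
    impact_on_practice: str = ""             # Potential effect on clinical work
    action_items: list[str] = field(default_factory=list) # Next steps if validated
    relevance_score: int = 0                 # 0-100 personalized relevance
```

Requesting a fixed schema like this from the model (rather than free-form prose) is what makes the digest entries consistent and machine-checkable.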

Important limitations

  • AI summaries may contain factual errors or miss important nuances
  • Summaries should never replace reading the original publication
  • AI cannot assess clinical applicability to individual patients
  • Preprint summaries may reflect findings that change after peer review

Personalization

Your digest improves over time through a feedback-driven personalization system:

Cold-start profile

During onboarding, your specialty, sub-interests, and clinical context are converted into a semantic embedding that serves as your initial preference profile.

Adaptive learning

Each time you rate an article as relevant or not relevant, your profile is updated using an exponentially weighted moving average (EWMA). Positive signals are weighted more strongly to prevent profile drift from noise.

Drift guard

A niche anchor — based on your original specialty declaration — prevents your profile from drifting too far from your core expertise, even after many ratings.
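A simplified sketch of the EWMA update and drift guard, assuming unit-length embeddings; the learning rates and anchor weight below are illustrative assumptions:

```python
import math

def update_profile(
    profile: list[float],
    article: list[float],
    liked: bool,
    niche_anchor: list[float],
    alpha_pos: float = 0.15,    # assumption: positive signals weighted more strongly
    alpha_neg: float = 0.05,
    anchor_pull: float = 0.05,  # assumption: drift-guard blend weight
) -> list[float]:
    """EWMA update of the preference embedding, pulled back toward the anchor."""
    alpha = alpha_pos if liked else -alpha_neg
    updated = [(1 - abs(alpha)) * p + alpha * a for p, a in zip(profile, article)]
    # Drift guard: blend a little of the original niche anchor back in,
    # so the profile can never wander arbitrarily far from the declared specialty.
    updated = [(1 - anchor_pull) * u + anchor_pull * n
               for u, n in zip(updated, niche_anchor)]
    norm = math.sqrt(sum(u * u for u in updated))
    return [u / norm for u in updated]  # re-normalize for cosine similarity
```

Because the anchor term is applied on every update, its influence is constant rather than decaying, which is what keeps the profile tethered even after many ratings.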

Data Sources

PubMed / MEDLINE

Peer-reviewed

Comprehensive index of biomedical and life science literature. Articles are peer-reviewed before inclusion.

arXiv

Preprint

Open-access preprints in quantitative biology, computer science, and statistics. Not peer-reviewed.

medRxiv

Preprint

Health sciences preprints. Screened for scope but not peer-reviewed. Findings should be interpreted with caution.

bioRxiv

Preprint

Biology preprints covering all life sciences. Not peer-reviewed. May include preliminary or incomplete findings.

What We Don't Do

  • No clinical decision support. rounds. does not provide treatment recommendations or diagnostic suggestions.
  • No drug interaction checks. We summarize research findings but do not evaluate interactions or contraindications.
  • No patient-specific advice. Summaries are generic and cannot account for individual patient circumstances.
  • No replacement for peer review. Our evidence flags are automated estimates, not expert peer review.

For our full medical disclaimer, see Medical Disclaimer.

Questions about our methodology?

We welcome feedback from the medical community.
