Methodology & Transparency
How rounds. scores, classifies, and summarizes medical research for your daily digest.
We believe physicians deserve full transparency on how AI tools work. This page documents every step of our pipeline — from article discovery to the digest you read each morning.
How We Score Articles
Every article in your feed receives a relevance score from 0 to 100. This score is computed from four weighted factors, calibrated through physician feedback:
Semantic Similarity (45%)
How closely an article matches your specialty, sub-interests, and reading history, measured via embedding similarity between paper abstracts and your personalization profile.
Evidence Quality (30%)
Combines evidence level (L1-L5), study design, bias flag count, sample size, and a composite quality score (0-1). Higher-quality evidence is weighted more heavily.
Recency (15%)
More recent publications receive a higher score. A time-decay function ensures your digest surfaces the latest findings without ignoring important older work.
Novelty (10%)
Articles covering topics you haven't seen recently score higher. This prevents your feed from becoming an echo chamber and surfaces emerging research areas.
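The weighted combination above can be sketched in a few lines. This is a minimal illustration, not the production implementation: the factor inputs, the 30-day half-life, and the function names are assumptions; only the four weights come from this page.

```python
# Illustrative sketch of the four-factor weighted score described above.
# Weights are from this page; everything else (input normalization to 0-1,
# the 30-day half-life) is an assumption for demonstration.
WEIGHTS = {"semantic": 0.45, "evidence": 0.30, "recency": 0.15, "novelty": 0.10}

def recency_factor(days_old: float, half_life_days: float = 30.0) -> float:
    """Exponential time decay: 1.0 for a paper published today, 0.5 after one half-life."""
    return 0.5 ** (days_old / half_life_days)

def relevance_score(semantic: float, evidence: float, days_old: float, novelty: float) -> int:
    """Combine the four factors (each normalized to 0-1) into a 0-100 score."""
    factors = {
        "semantic": semantic,
        "evidence": evidence,
        "recency": recency_factor(days_old),
        "novelty": novelty,
    }
    score = sum(WEIGHTS[k] * factors[k] for k in WEIGHTS)
    return round(100 * score)

# Strong semantic match, good evidence, 30-day-old paper, moderate novelty:
print(relevance_score(semantic=0.9, evidence=0.8, days_old=30, novelty=0.5))  # prints 77
```

Because the weights sum to 1.0 and each factor is bounded by 0-1, the combined score is guaranteed to land in the 0-100 range.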
Evidence Quality Classification
Each article is automatically classified according to a 5-level evidence hierarchy, adapted from the Oxford Centre for Evidence-Based Medicine framework:
Level 1: RCTs & Meta-Analyses
Systematic reviews of RCTs and individual randomized controlled trials. The highest standard of clinical evidence.
Level 2: Prospective Studies
Systematic reviews of cohort studies and individual prospective cohort studies with sound methodology.
Level 3: Retrospective Studies
Case-control studies and retrospective analyses. Useful for hypothesis generation but prone to selection bias.
Level 4: Case Series
Case series and case reports without controls. Limited generalizability but valuable for rare conditions.
Level 5: Expert Opinion
Expert opinion, mechanism-based reasoning, and bench research without direct clinical validation.
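The hierarchy above amounts to a mapping from study design to an evidence level. A minimal sketch, assuming hypothetical design labels (the keys below are illustrative, not a documented taxonomy):

```python
# Illustrative mapping from study design to evidence level (L1-L5),
# following the five tiers described above. Design labels are assumptions.
EVIDENCE_LEVELS = {
    "systematic_review_rct": 1,
    "meta_analysis": 1,
    "rct": 1,
    "prospective_cohort": 2,
    "case_control": 3,
    "retrospective_cohort": 3,
    "case_series": 4,
    "case_report": 4,
    "expert_opinion": 5,
    "bench_research": 5,
}

def evidence_level(study_design: str) -> int:
    """Return the evidence level for a study design; unknown designs default to L5."""
    return EVIDENCE_LEVELS.get(study_design, 5)

print(evidence_level("rct"))          # prints 1
print(evidence_level("case_series"))  # prints 4
```

Defaulting unknown designs to L5 is a conservative choice: an article is never promoted to a higher evidence tier than its metadata supports.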
Bias Detection
Our pipeline checks for 18 canonical bias flags, including methodological limitations, conflicts of interest, and statistical concerns. Articles that accumulate three or more flags and sit at a low evidence level are marked as low quality.
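The low-quality rule can be expressed as a simple predicate. The "3+ flags" threshold is stated above; treating levels 4-5 as "low evidence" is an illustrative assumption:

```python
def is_low_quality(bias_flag_count: int, level: int) -> bool:
    """Mark an article as low quality when it has 3+ bias flags AND weak evidence.
    'Weak evidence' here is assumed to mean level 4 or 5; the exact cutoff
    used in production is not documented on this page."""
    return bias_flag_count >= 3 and level >= 4

print(is_low_quality(bias_flag_count=4, level=5))  # prints True
```

Note that the rule is conjunctive: a heavily flagged RCT is not suppressed, and a clean case report is not suppressed either.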
AI Summarization
Summaries are generated by GPT-4o with a structured medical prompt that extracts:
Core message
What does this paper conclude?
Strengths
Methodological strengths of the study
Weaknesses
Limitations and caveats to consider
Impact on practice
How this could affect your clinical work
Action items
Concrete next steps if findings are validated
Relevance score
0-100 personalized relevance to your niche
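The extracted fields above form a fixed schema. A minimal sketch of what that structured output might look like, assuming hypothetical field names that mirror the prompt sections (this is not a documented API):

```python
from dataclasses import dataclass
from typing import List

# Illustrative container for the structured summary fields listed above.
# Field names mirror the prompt sections; they are assumptions, not the
# production schema.
@dataclass
class ArticleSummary:
    core_message: str            # What does this paper conclude?
    strengths: List[str]         # Methodological strengths of the study
    weaknesses: List[str]        # Limitations and caveats to consider
    impact_on_practice: str      # How this could affect clinical work
    action_items: List[str]      # Next steps if findings are validated
    relevance_score: int         # 0-100 personalized relevance

    def __post_init__(self):
        if not 0 <= self.relevance_score <= 100:
            raise ValueError("relevance_score must be between 0 and 100")
```

Validating the schema after generation is one way to catch malformed model output before it reaches a digest.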
Important limitations
- AI summaries may contain factual errors or miss important nuances
- Summaries should never replace reading the original publication
- AI cannot assess clinical applicability to individual patients
- Preprint summaries may reflect findings that change after peer review
Personalization
Your digest improves over time through a feedback-driven personalization system:
Cold-start profile
During onboarding, your specialty, sub-interests, and clinical context are converted into a semantic embedding that serves as your initial preference profile.
Adaptive learning
Each time you rate an article as relevant or not relevant, your profile is updated using an exponentially weighted moving average (EWMA). Positive signals are weighted more strongly to prevent profile drift from noise.
Drift guard
A niche anchor — based on your original specialty declaration — prevents your profile from drifting too far from your core expertise, even after many ratings.
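The three personalization steps above can be sketched as a single update function. This is a minimal illustration under stated assumptions: the learning rates, the anchor blend weight, and the list-based vectors are all hypothetical; only the EWMA shape, the asymmetric weighting of positive vs. negative feedback, and the anchor blend come from this page.

```python
import math

def update_profile(profile, article_emb, relevant, anchor,
                   alpha_pos=0.10, alpha_neg=0.03, anchor_weight=0.15):
    """EWMA profile update with a niche-anchor drift guard.

    Assumptions for illustration: positive feedback moves the profile toward
    the article embedding with a larger step (alpha_pos) than negative
    feedback moves it away (alpha_neg), and every update is blended back
    toward the cold-start specialty anchor. The specific constants are
    illustrative, not production values.
    """
    alpha = alpha_pos if relevant else -alpha_neg
    # EWMA step toward (or away from) the rated article's embedding
    updated = [(1 - abs(alpha)) * p + alpha * a for p, a in zip(profile, article_emb)]
    # Drift guard: blend toward the original specialty anchor
    guarded = [(1 - anchor_weight) * u + anchor_weight * n for u, n in zip(updated, anchor)]
    # Renormalize so embedding similarity stays comparable across updates
    norm = math.sqrt(sum(g * g for g in guarded))
    return [g / norm for g in guarded]
```

Because the anchor term is applied on every update, the profile can adapt to new sub-interests while remaining bounded near the physician's declared specialty.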
Data Sources
PubMed / MEDLINE (Peer-reviewed)
Comprehensive index of biomedical and life science literature. Articles are peer-reviewed before inclusion.
arXiv (Preprint)
Open-access preprints in quantitative biology, computer science, and statistics. Not peer-reviewed.
medRxiv (Preprint)
Health sciences preprints. Screened for scope but not peer-reviewed. Findings should be interpreted with caution.
bioRxiv (Preprint)
Biology preprints covering all life sciences. Not peer-reviewed. May include preliminary or incomplete findings.
What We Don't Do
- No clinical decision support. rounds. does not provide treatment recommendations or diagnostic suggestions.
- No drug interaction checks. We summarize research findings but do not evaluate interactions or contraindications.
- No patient-specific advice. Summaries are generic and cannot account for individual patient circumstances.
- No replacement for peer review. Our evidence flags are automated estimates, not expert peer review.
For our full medical disclaimer, see Medical Disclaimer.