Critical Care · 2023 · Erasme University Hospital, Brussels

The Digital Researcher: Literature Review and Academic Writing with AI
A complete AI-assisted academic research workflow, from PubMed search to manuscript: screening, extraction, drafting, ICMJE disclosure

Artificial Intelligence Hallucinations in Anaesthesia: Causes, Consequences and Countermeasures

✍️ Salvagno M, Taccone FS, Gerli AG

📋 Big picture

🎯 In brief: what is this about?

This Deep Dive surveys the full workflow of academic research with AI tools in 2026:

•PubMed search

•Screening (ASReview LAB v.2, a new version with multi-agent screening; Rayyan, a collaborative screening tool)

•Data extraction (Elicit; Custom GPT, a GPT customized with your own instructions and files)

•Synthesis (NotebookLM)

•Drafting (Claude/GPT)

•Editing (Trinka, Paperpal)

Each stage saves 50-80% of the time.

According to the AMA Physician AI Survey 2026 (March 2026), 81% of physicians use AI professionally, more than double the 38% of 2023. 39% use it for summaries of medical research (the most common use), 35% for literature search, and 28% for billing codes/charts.

We cover the critical ethical issues: the January 2026 update of ICMJE (International Committee of Medical Journal Editors, the standard-setter in academic publishing), which added a dedicated Section V on AI, and WAME (World Association of Medical Editors). Both hold that AI cannot be an author, but that its use must be disclosed in the paper. Detection tools (GPTZero, Originality.ai) are unreliable; in 2026 the standard is trust + disclosure.

We highlight the seminal paper by Salvagno (Critical Care 2023) on hallucinations (the model inventing information) and mitigation techniques. Finally: a recommended $40-60/month stack for an Israeli research dermatologist, and a 4-week bootcamp to reach power-user level. ROI: over 100 hours saved per year.

📊 81% - physicians who use AI professionally (AMA 2026)

📚 39% - physicians who use AI for summaries of research (AMA 2026)

⚡ 80% - time saved in screening with ASReview

⚠️ 30-40% - hallucinated citations from ChatGPT-3.5

🚫 Nature/Science papers that allow AI authorship

🔄 Stages in the full workflow

💡 What to remember from this article

🔄The AI workflow for academic research has 5 stages: PubMed search; screening (ASReview, an open-source tool for active learning in screening); data extraction (Custom GPT); synthesis (NotebookLM); drafting (Claude/GPT). Each stage saves 50-80% of the time.

🚫ICMJE January 2026 update (new Section V) and WAME (Sept 2023, updated on an ongoing basis): AI cannot be an author, because an author bears responsibility and AI cannot. Disclosure of AI use is nevertheless required in every paper.

⚠️Salvagno 2023 (Critical Care): hallucinations are the central problem of AI in research. 30-40% of the citations produced by raw ChatGPT are hallucinated. The remedy: RAG, verification, transparency.

🎯Detection tools (GPTZero, Originality.ai): false positive rate of 5-15%, and sensitivity falls as the models improve. In 2026 they are not reliable; journals rely on trust + disclosure.

📚Reference managers in 2026: Zotero (free citation-management software, the most popular in academia, with AI plugins), Mendeley, EndNote. All have added AI features: auto-summarization, citation extraction, related papers.

✍️Vibe Writing with AI: the risk is generic writing. The remedy: AI for drafting + structure, a human for voice + insight + critical reading. AI is a partner, not a ghostwriter.

🔍 Advanced PubMed Search - Beyond Boolean

PubMed remains the standard for biomedical research. In 2026 it holds over 38 million records, with about 1.5 million added per year. But advanced searching takes skill.

Three search methods:

•Basic search - plain words. Broad and fast, but may miss relevant papers

•MeSH (Medical Subject Headings, a controlled vocabulary of 30,000+ terms used to index PubMed articles) search - high precision, but requires familiarity with the hierarchy

•Field tags + Boolean (combining AND/OR/NOT) - the advanced combination

Example of an advanced query:

'((dupilumab[Title/Abstract] OR "anti IL-4"[Title/Abstract]) AND ("bullous pemphigoid"[MeSH] OR "BP"[Title/Abstract]) AND ("2020"[PDat] : "2026"[PDat]) AND ("randomized controlled trial"[Publication Type] OR "meta-analysis"[Publication Type])) NOT (review[Publication Type])'
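
A query like the one above is easier to maintain when it is assembled from its parts rather than typed by hand. The sketch below is illustrative only: the helper names (`any_of`, `build_query`, `esearch_url`) are hypothetical, and every term is quoted for uniformity. It builds the query string and the corresponding NCBI E-utilities `esearch` URL without sending any request.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def any_of(terms, field):
    """OR together quoted terms that share one PubMed field tag."""
    return "(" + " OR ".join(f'"{t}"[{field}]' for t in terms) + ")"


def build_query(drug_terms, disease_mesh, disease_abbrev, year_from, year_to, pub_types):
    """Assemble drug AND disease AND date range AND publication type, minus reviews."""
    drug = any_of(drug_terms, "Title/Abstract")
    disease = f'("{disease_mesh}"[MeSH] OR "{disease_abbrev}"[Title/Abstract])'
    dates = f'("{year_from}"[PDat] : "{year_to}"[PDat])'
    types = any_of(pub_types, "Publication Type")
    return f"({drug} AND {disease} AND {dates} AND {types}) NOT (review[Publication Type])"


def esearch_url(query, retmax=200):
    """Build (but do not send) an E-utilities esearch request for PubMed."""
    params = {"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"}
    return ESEARCH + "?" + urlencode(params)


query = build_query(
    ["dupilumab", "anti IL-4"],
    "bullous pemphigoid",
    "BP",
    2020,
    2026,
    ["randomized controlled trial", "meta-analysis"],
)
url = esearch_url(query)
print(query)
print(url)
```

Keeping the parts separate makes it simple to widen one concept (for example, adding a synonym to the drug list) without disturbing the parentheses of the rest.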

PubMed AI features:

•'Best Match' (PubMed's ML algorithm that orders results by relevance) - algorithmic sorting, developed in 2018

•'Similar Articles'

•'Computed Author'

•'PubReMiner'

New AI tools that work alongside PubMed:

•BioBERT (Lee 2019) and PubMedBERT

•SciSpace

•Elicit

•Consensus

Advanced workflow:

•Pyramid approach - stage 1 broad, stage 2 narrows with filters, stage 3 deep dive

•PRISMA flow (PRISMA is the reporting standard for a Systematic Review, a review that synthesizes RCTs, i.e. Randomized Controlled Trials) - document the whole process

•Saved searches + alerts in PubMed: create an alert on a search query, and every new matching paper is sent by email

•MyNCBI (a free NLM account for managing searches and collections) - manage collections

•RIS export - export to Zotero, EndNote, Mendeley

Stage 1 - broad. Example query:

'((dupilumab[Title/Abstract] OR "anti IL-4"[Title/Abstract] OR "IL-4 receptor inhibitor"[Title/Abstract]) AND ("bullous pemphigoid"[MeSH Major Topic] OR "BP"[Title/Abstract]))'

Result: 200 records

Stage 2 - filtering: add '("2020"[PDat] : "2026"[PDat]) AND english[Language] AND humans[MeSH]'. Result: 80 records

Stage 3 - publication types: add '("randomized controlled trial"[Publication Type] OR "meta-analysis"[Publication Type])'. Result: 25 records

PRISMA flow:

•Identified: 200 records

•After duplicates removed: 180

•After title/abstract screening: 100

•After full text screening: 25

•Eligible: 25, Included: 20

Reasons for exclusion:

•Not English: 10

•Not relevant: 40

•Not RCT/SR (Systematic Review): 105

Documentation: a PRISMA flow diagram + a table of the search strategy. Required for publishing an SR.
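
Before drawing the diagram, it helps to check that the counts add up. The sketch below uses the example numbers from the PRISMA flow above; the variable and function names are hypothetical. It shows that the three listed exclusion reasons (155 records) account exactly for the drop from 180 de-duplicated records to 25 after full-text screening, while the final drop from 25 eligible to 20 included has no listed reason and would need one in a real report.

```python
# Counts copied from the example PRISMA flow; names are illustrative.
flow = {
    "identified": 200,
    "after_duplicates": 180,
    "after_title_abstract": 100,
    "after_full_text": 25,
    "included": 20,
}
exclusions = {"Not English": 10, "Not relevant": 40, "Not RCT/SR": 105}


def stage_drops(counts):
    """Records lost between each pair of consecutive stages."""
    names = list(counts)
    return {f"{a} -> {b}": counts[a] - counts[b] for a, b in zip(names, names[1:])}


drops = stage_drops(flow)
screened_out = flow["after_duplicates"] - flow["after_full_text"]
explained = sum(exclusions.values())
unexplained = flow["after_full_text"] - flow["included"]

for step, lost in drops.items():
    print(f"{step}: {lost} removed")
print(f"Screening removed {screened_out}, reasons cover {explained}")
print(f"Eligible but not included, no reason listed: {unexplained}")
```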


PubMed Best Match (developed 2018, launched 2019): an ML algorithm trained on billions of clicks by PubMed users

The idea: for similar queries, users clicked on paper X, so X is treated as relevant to similar queries

Comparison: Most Recent sorts by date, which works poorly when you are looking for classic papers. Best Match sorts by relevance + citation count + date, and has been the PubMed default since 2019

Try it: search 'psoriasis biologics'. Most Recent: papers from 2024-2025 only, missing seminal papers. Best Match: the top 20 includes defining RCTs (Adesa, EXTEND) from 2010-2020

When Most Recent is better: catching up on a topic you already know. 'What's new in X?'

Recommendation: Best Match as the default, Most Recent for recency-focused queries


MyNCBI (a free NLM account): save searches and collections of papers. The key feature: alerts

The flow: build a complex search query, click 'Create Alert', and set the frequency (daily, weekly, monthly). Format: an email with links to new papers that match the query

Example: an alert on 'dupilumab AND psoriasis - 2024-2025'. Each week, an email with 0-5 new papers. If there are any, it takes 5 minutes to scan them and decide whether they are relevant

Advanced idea: 3-5 alerts on the specific fields you follow. Examples:

•psoriasis biologics

•BP treatment

•AI dermatology

•Mohs surgery margins

•Skin cancer AI

5 alerts a week = 30-60 minutes of update time instead of an hour or more


🔬 Screening - ASReview, Rayyan AI

Screening is the most labor-intensive stage of a systematic review. A typical SR screens 1,000-10,000 abstracts to find 50-200 relevant papers. At 30 seconds per abstract, 10,000 abstracts take 80+ hours. Double that for a second reviewer: 160 hours.

The solution: AI-assisted screening. ASReview (van de Schoot et al, Nature Machine Intelligence 2021) is an open-source tool from Utrecht University that uses active learning (an ML method in which the model picks the most informative examples to learn from).

ASReview LAB v.2 (published in Patterns 2025): a new version with multi-agent screening (several reviewers on the same AI model), a performance gain of 24.1% reduction in loss versus v.1, support for multilingual transformer models, and collaborative workflows.

The flow:

•Upload all abstracts (RIS file)

•The reviewer marks 5-10 seed papers as relevant + 5-10 as irrelevant

•ASReview initializes an ML model

•It presents abstracts in order of predicted relevance

•The reviewer screens quickly

•The model updates after every decision

The result: you typically read 20-30% of the abstracts and find 95%+ of the relevant papers (recall/sensitivity: the share of relevant papers you found). A saving of 70-80%. van de Schoot 2021 tested 6 SR datasets: 80% less screening time with 95%+ recall.
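
To make "read 20-30% and find 95%+" concrete, the sketch below computes how far down a relevance-ranked list a reviewer must go to reach a target recall. It is not ASReview code: the `ranked` list is a made-up ordering (1 = relevant, 0 = irrelevant) standing in for what an active-learning model might produce, and the function name is hypothetical.

```python
def screened_to_reach_recall(ranked_labels, target_percent=95):
    """Number of records read, in ranked order, before the target recall is met."""
    total_relevant = sum(ranked_labels)
    if total_relevant == 0:
        raise ValueError("no relevant records in the list")
    # Integer ceiling of target_percent% of the relevant records.
    needed = (target_percent * total_relevant + 99) // 100
    found = 0
    for n_screened, label in enumerate(ranked_labels, start=1):
        found += label
        if found >= needed:
            return n_screened
    return len(ranked_labels)


# Hypothetical ranking of 1,000 abstracts, 20 of them relevant.
ranked = [1] * 15 + [0] * 85 + [1] * 4 + [0] * 146 + [1] + [0] * 749

read_95 = screened_to_reach_recall(ranked, 95)
read_100 = screened_to_reach_recall(ranked, 100)
seconds_per_abstract = 30  # the per-abstract time assumed in the text

print(f"95% recall after {read_95} of {len(ranked)} abstracts")
print(f"100% recall after {read_100} of {len(ranked)} abstracts")
print(f"Hours at 95% recall: {read_95 * seconds_per_abstract / 3600:.2f}")
print(f"Hours to read everything: {len(ranked) * seconds_per_abstract / 3600:.2f}")
```

The gap between the 95% and 100% figures illustrates the trade-off behind any stopping rule: the last few relevant papers cost far more reading than the first ones.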

Other tools:

•Rayyan (a collaborative screening tool developed by QCRI, the Qatar Computing Research Institute, founded 2016) - a popular alternative. It has a free version and supports collaborative screening by multiple reviewers. In 2026 it offers 2 plans, Essential and Advanced, with annual savings of up to 40%

•DistillerSR-AI - a commercial tool from DistillerSR. Expensive (over $10,000 per project) but offers power-user features

•Covidence (the standard for Cochrane reviews and a Cochrane partner; not originally AI-driven). Added AI in 2024

8 steps:

•Step 1: Export RIS from PubMed/Scopus/etc

•Step 2: Install ASReview LAB (Python pip, or Docker). Open it at localhost:5000

•Step 3: Create a new project. Upload the RIS file

•Step 4: Setup - choose a model (default: Naive Bayes or Logistic Regression). Mark 5 seed papers as relevant + 5 as irrelevant

•Step 5: Start screening. Each abstract is shown; mark it relevant/irrelevant. The model updates in the background

•Step 6: As screening proceeds, the predicted relevance of the remaining records falls. When the model indicates that you are unlikely to find more relevant records, stop. This usually happens after 20-30% of the abstracts

•Step 7: Export the relevant papers (RIS)

•Step 8: Continue with full-text review

Benefit: 1,000 abstracts in 2-3 hours instead of about 8 (1,000 abstracts at 30 seconds each). Validation: the ASReview validation study showed 95-100% recall in most cases

Caveats: AI is not perfect; expect 0-5% false negatives. For high-stakes research (Cochrane SR), have a second human reviewer manually screen a subset to confirm nothing was missed


Rayyan (the free tier is enough for most SRs). Features:

•Upload RIS, automatic deduplication

•Multi-reviewer support - invite a colleague. Each reviewer screens independently and blinded

•Conflict detection - when 2 reviewers disagree, a third decides

•AI assistance - flags papers similar to those already included

•Inclusion/exclusion criteria - define them before screening

•Tagging - add personal tags

•Export - RIS, CSV, custom report

Typical process: Reviewers 1 and 2 screen independently. Disagreements go to Reviewer 3 (the PI). The result: high-quality, consistent screening

Pricing 2026: 2 individual plans (Essential and Advanced). Annual savings of up to 40%. Premium features: auto-resolving duplicates, mobile app, PICO extraction

Most academic SRs use Rayyan or Covidence. Rayyan is better for team workflows; Covidence meets Cochrane standards


ASReview, Rayyan, and Covidence were all developed on English content. Their tokenization and ML models are English-centric

Hebrew? In theory it works, but it has not been tested in depth. For papers in Hebrew (Harefuah, IMAJ), ASReview will run but less effectively, because the model misses the nuance. Recall can drop to 80% (versus 95%+ in English)

Advice: for most SRs in dermatology, 99% of the papers are in English. If there are a few Hebrew papers, hand-screen them

For research where Hebrew papers are a significant chunk:

•AI screening for English

•Manual screening for Hebrew

•Combined PRISMA flow

A broader problem: general AI tools are less effective in Hebrew. RTL display works, but the semantics suffer. In 2026 this is still an open issue


📊 Data Extraction - Elicit, Custom GPT, Schema

Once screening is done and you have 50-200 papers to include, the next stage is data extraction: pulling the structured information from each paper into a table.

What to extract (following PICO, Population/Intervention/Comparator/Outcome, the structure of a research question):

•Study design

•Sample size

•Intervention

•Comparator

•Outcomes

•Key results

•Limitations

•Funding source

In dermatology specifically:

•PASI/SCORAD/PGA scores

•Time points

•AEs

•Drug doses

•Retention rates

Manual extraction: 30-60 minutes per paper. 100 papers = 50-100 hours.

AI-assisted extraction:

•Elicit - covered earlier

•DistillerSR-AI - expensive but professional

•GPT-5.5 and Claude Opus 4.7 (LLMs, Large Language Models) - can extract data from PDFs

The flow: upload the PDF, prompt with a schema (a defined template for the output), receive JSON (a structured data format of fields and values). Accuracy: 80-90% with verification.
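
Whatever model produces the JSON, it is worth checking each record against the schema before it reaches the spreadsheet. The following is a minimal sketch using only the standard library. The field names are an assumed example schema, not one prescribed by the article, and the sentinel strings mirror the 'unclear' / 'not reported' convention used in the prompt instructions.

```python
import json
from dataclasses import dataclass, fields

SENTINELS = {"unclear", "not reported"}


@dataclass
class Extraction:
    # Hypothetical schema; adapt the fields to your own review protocol.
    study_design: str
    sample_size: object  # positive int, or one of SENTINELS
    intervention: str
    comparator: str
    primary_outcome: str
    source_page: object  # positive int, or one of SENTINELS


def is_count_or_sentinel(value):
    """Accept a positive integer or an explicit sentinel string, nothing else."""
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        return value > 0
    return isinstance(value, str) and value in SENTINELS


def parse_extraction(raw_json):
    """Parse one model response and reject records that break the schema."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {f.name for f in fields(Extraction)}
    if set(data) != expected:
        raise ValueError(f"field mismatch: {sorted(set(data) ^ expected)}")
    for name in ("sample_size", "source_page"):
        if not is_count_or_sentinel(data[name]):
            raise ValueError(f"{name} must be a positive integer or a sentinel")
    return Extraction(**data)


good = parse_extraction(json.dumps({
    "study_design": "RCT",
    "sample_size": 120,
    "intervention": "dupilumab",
    "comparator": "placebo",
    "primary_outcome": "complete response at week 16",
    "source_page": "not reported",
}))
print(good)
```

Rejecting a free-text value such as "about 60" in a numeric field forces the ambiguity back to a human instead of letting it slip silently into the analysis table.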

Comparison of methods:

•Manual: 100% accuracy at best, 50-100 hours

•Single AI extraction: 80% accuracy, 5 hours

•AI extraction + manual validation of 10%: 95% accuracy, 8 hours

Recommendation: option 3 is the sweet spot. Comparative studies: several studies in 2024-2025 (J Med Internet Res and others) showed that GPT-4/GPT-5 can extract data from clinical trial papers with an accuracy of about 85-90%.

Common errors:

•Errors in dates

•Numbers in tables (rather than in the text)

•Nuanced outcomes (partial response defined differently in each paper)

Setup (30-60 minutes):

•Step 1: in ChatGPT - Create Custom GPT, or in Claude.ai - Create Project

•Step 2: System instructions: 'You are a medical research data extractor. The user provides a PDF of a clinical trial. Extract the data as JSON using this schema: paste schema. If a field is unclear, write unclear. If it is missing, write not reported. Cite the page number for every extraction'

•Step 3: Knowledge base: upload 1-2 example papers with their extractions, to show the desired formatting

•Step 4: Conversation starters: 'Upload your RCT PDF and I'll extract structured data'

•Step 5: Test on 3 papers you know well and check accuracy. If accuracy is 90%+, it is ready

Routine use: drag-and-drop a PDF and receive JSON within 30 seconds. Validate edge cases. Paste into a spreadsheet

ROI: 30-60 minutes of setup -> 5-10 hours saved per project


Common errors:

•Numbers in tables - AI sometimes misreads tables. Example: PASI-75 = 71% in the table, extracted by the AI as 75% (it mistook the column). Fix: validate table numbers specifically

•Dates - 'Recruitment 2018-2021' is sometimes reported by the AI as 2018 only or 2021 only

•Nuanced outcomes - the definition of 'complete response' varies. In study 1: 'no visible disease'. In study 2: 'PASI 100'. AI sometimes ignores the definition

•Missing data - AI sometimes confuses 'Not reported', '0', and 'unclear'

•Subgroups - 'Effect in elderly was 60%, total 75%'. AI sticks to the total

•Drug name variations - 'Adalimumab biosimilar' vs 'Adalimumab' are sometimes merged

Emphasis: in every validation, focus on numerical accuracy + outcome definitions


5 steps toward automation:

•Step 1: PDFs in a folder (after screening). 50-200 PDFs

•Step 2: Custom GPT/Claude Project setup as above

•Step 3: Manual workflow: drag-and-drop each PDF and collect the JSON. Automated option: a Python script using LangChain (a framework for building AI workflows in Python) that reads each PDF, sends it to the API, and collects the JSON

•Step 4: Concatenate the JSONs into a single table. In Excel: paste the JSONs and parse them with Power Query. In Python: pandas

•Step 5: Validation - check a random 10%. If accuracy is 90%+, good. If lower, re-prompt or extract manually
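
Steps 4 and 5 can be sketched with the standard library alone. The example below swaps in the `csv` module for pandas to stay dependency-free; the function names, column names, and sample records are all hypothetical. It merges JSON records into one CSV table, draws a reproducible 10% validation sample, and scores AI output against a human re-extraction field by field.

```python
import csv
import io
import json
import random


def records_to_csv(json_records, columns):
    """Merge per-paper JSON strings into one CSV; absent fields become 'not reported'."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="not reported", extrasaction="ignore"
    )
    writer.writeheader()
    for raw in json_records:
        writer.writerow(json.loads(raw))
    return buffer.getvalue()


def validation_sample(paper_ids, fraction=0.10, seed=0):
    """Pick a reproducible random subset of papers for manual checking."""
    k = max(1, round(len(paper_ids) * fraction))
    return sorted(random.Random(seed).sample(paper_ids, k))


def field_accuracy(ai_rows, human_rows, columns):
    """Share of checked fields where the AI value equals the human value."""
    checked = correct = 0
    for paper_id, human in human_rows.items():
        for col in columns:
            checked += 1
            correct += ai_rows[paper_id].get(col) == human.get(col)
    return correct / checked


table = records_to_csv(
    ['{"paper_id": "P1", "design": "RCT", "n": 120}',
     '{"paper_id": "P2", "design": "cohort"}'],
    columns=["paper_id", "design", "n"],
)
sample = validation_sample([f"P{i}" for i in range(1, 101)])
accuracy = field_accuracy(
    {"P1": {"design": "RCT", "n": 120}},
    {"P1": {"design": "RCT", "n": 118}},
    columns=["design", "n"],
)
print(table)
print(sample)
print(accuracy)
```

Fixing the random seed means a co-author can regenerate exactly the same validation subset later, which is useful when the check has to be documented.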

The result: a spreadsheet with 50-200 rows and 20-30 columns, ready for analysis

ROI for extracting 100 papers. Manual: 60 hours. AI workflow: 10 hours (3 setup, 5 extraction, 2 validation). Saving: 50 hours

Tools:

•ChatGPT Plus + Excel is sufficient for most

•For advanced users: Python + LangChain + pandas, plus a schema-validation library for the JSON (Pydantic or Instructor in Python, zod in TypeScript)

•For commercial research: DistillerSR-AI (paid)


📝 Synthesis - From Data to Narrative

Synthesis is the stage where you turn the table of 100 papers into a story. It is the hardest part and also the most human-driven. AI can help but cannot do it alone.

Approaches to synthesis:

•Quantitative (meta-analysis) - computing a pooled effect size. Requires statistical software. AI helps with extraction but not with the statistics

•Qualitative (narrative synthesis) - a thematic description of the findings. Here AI helps significantly

•Mixed methods - a combination of both
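
For readers unfamiliar with the term, a "pooled effect size" is a weighted average of the individual study effects. The sketch below shows one common model, the fixed-effect inverse-variance estimate, on two made-up studies. It is meant only to illustrate the arithmetic: the function name and input numbers are hypothetical, and a real meta-analysis would use dedicated statistical software and usually also consider a random-effects model and heterogeneity.

```python
import math


def pooled_fixed_effect(effects, standard_errors):
    """Inverse-variance weighted mean, its standard error, and a 95% CI."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, ci


# Two hypothetical studies: effect estimates with their standard errors.
pooled, pooled_se, ci = pooled_fixed_effect([0.5, 0.3], [0.1, 0.2])
print(f"pooled = {pooled:.3f}, SE = {pooled_se:.4f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

The more precise study (smaller standard error) receives four times the weight here, which is why the pooled value sits closer to 0.5 than to 0.3.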

AI for narrative synthesis: NotebookLM is the best tool. The flow: upload all 50-200 PDFs to NotebookLM and ask:

•'What are the main themes across these studies?'

•'Where do studies disagree?'

•'What are the methodological limitations?'

•'What are the gaps in evidence?'

Every answer comes with citations. But the AI cannot do critical synthesis on its own. It summarizes what is written; it does not understand clinical context and does not detect subtle biases.

The human work:

•Critical reading - read a subset of key papers yourself

•Contextual interpretation - understand why findings differ between studies

•Clinical relevance

•Theoretical framework

•Limitations of the SR itself

A big warning: AI can write a synthesis that sounds professional but is hollow. The signs: generic phrases, no sharp points, loss of voice. The remedy: iterative drafting.

After uploading all the included papers, ask NotebookLM 5 questions:

•'What are the main themes/findings across these studies?' - gives a thematic overview

•'Where do studies agree, and where do they disagree? Cite specifically' - identifies consensus + controversy

•'What are the methodological strengths and limitations of these studies as a body of evidence?' - critical assessment

•'What populations are well-represented and what are underrepresented?' - identifies gaps

•'What are open research questions or future directions suggested by this evidence?' - research agenda

Every answer comes with citations

Now the human work:

•Read top 5 papers in depth

•Compare AI synthesis to your reading

•Add critical insights AI missed

•Re-prompt AI with specific questions ('What about subgroup X?')

The result: a synthesis draft with a human voice + AI breadth


AI can detect:

•Funding bias - 'supported by Pfizer' stated in the paper

•Conflict of interest - declared COI

•Sample size limitations - 'n=15' in the fine print

•Selection bias that surfaces in the text - 'convenience sample'

•Reporting bias - 'primary outcome not reported'

AI does not detect:

•Subtle methodological flaws - randomization quality, blinding adequacy

•Statistical issues - data dredging, multiple comparisons without adjustment

•Undeclared author conflicts

•Publication bias - it cannot see the unpublished papers

•Spin - a positive twist in titles/abstracts that the data do not support

Example: a paper whose primary endpoint was missed but whose abstract says 'promising results'

The Cochrane RoB-2 tool is the standard for bias assessment. AI can assist with domains 1-3 but not with all domains

Validation studies in 2024-2025: GPT-4 reached RoB-2 accuracy of about 60-75% against expert humans - sufficient as an aid, not as a replacement. Manual assessment remains the gold standard


The process:

•Step 1 (AI - 30 minutes): NotebookLM draft of the discussion section on themes

•Step 2 (human - 60 minutes): reading and corrections

•Step 3 (AI - 15 minutes): polish (Claude Opus 4.7 or GPT-5.5)

•Step 4 (human - 30 minutes): final read

Corrections in step 2:

•Add clinical insight that is not in the papers

•Connect to broader knowledge

•Remove generic sentences ('further research needed')

•Add specific quantitative comparisons

•Remove irrelevant points

Prompt for step 3: 'Polish this discussion - improve flow, fix grammar, ensure consistency in terminology. Don't add new content'

Step 4: fact-check numbers, validate citations, restore your personal style

ROI: 2-3 hours versus 10+ manually. Quality: sometimes better than purely manual (AI catches gaps a human misses)

Warning: disclosure. If you used AI in drafting, most journals require disclosure. Per ICMJE (January 2026 update, new Section V): 'authors who use AI-assisted technologies should describe in both the cover letter and the submitted work how they used it'


✍️ Writing Tools - Grammarly, Trinka, Paperpal, Claude

The difference between a draft and an excellent text is often the editing and the house style. AI writing-assistance tools help.

Grammarly is the veteran (2009). It evolved from a grammar checker into an AI writing assistant. The Premium version (about $30 per month) includes:

•Tone detection

•Plagiarism check

•AI rewriting

In academic research it cuts the noise (typos, grammar errors) but does not know medical terminology.

Trinka AI is specialized for academic writing. Features:

•Academic tone enforcement

•Consistency check (US vs UK English)

•Field-specific terminology (medical, biomedical)

•Plagiarism detection

•Journal-specific style

Better than Grammarly for medical papers. Pricing 2026: from $6.67/month (annual billing) for the basic plan, up to ~$20 per month for the advanced plans. A limited free version also exists.

Writefull (Cambridge-based) is minimalist; its free version is sufficient.

Paperpal (Cactus Communications, 2022) is similar to Trinka; in 2026, Prime costs $25 per month or $139 per year.

ChatGPT and Claude can do everything described above: 'Polish this paragraph for academic publication', 'Fix grammar and improve clarity'. Advantage: more flexible. Disadvantage: you manage the prompts yourself.

Comparison by task:

•Grammar/typos - Grammarly or Trinka is fastest

•Academic tone - Trinka or Paperpal

•Major rewrite - Claude Opus 4.7 / GPT-5.5

•Plagiarism check - Grammarly Premium or Turnitin

•Journal-specific - Paperpal or Trinka

Common errors in medical writing:

•Tense inconsistency

•Active vs passive

•Hedging

•Wordiness

•Anglicized terms

•Citation formatting

Warning: AI rewriting can subtly change meaning. The problem is most acute with numbers, drug names, and measurements. Verify after every AI edit.
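
Part of that verification can be automated. The sketch below compares the numeric tokens in a paragraph before and after an AI edit and reports any that disappeared or appeared. It is a simple regex heuristic with hypothetical names, not a full fact-checker: it will not catch a number that moved to the wrong sentence, and drug names still need a manual read.

```python
import re
from collections import Counter

# Integers, decimals, and percentages such as 120, 62.5, 71%
NUMBER = re.compile(r"\d+(?:[.,]\d+)*%?")


def numeric_tokens(text):
    """Count every numeric token in the text."""
    return Counter(NUMBER.findall(text))


def changed_numbers(original, edited):
    """Numbers lost from the original and numbers newly introduced by the edit."""
    before, after = numeric_tokens(original), numeric_tokens(edited)
    return {
        "missing": sorted((before - after).elements()),
        "added": sorted((after - before).elements()),
    }


original = "PASI-75 was achieved by 71% of 120 patients at week 16 (95% CI 62.5-79.5)."
edited = "At week 16, 75% of 120 patients achieved PASI-75 (95% CI 62.5-79.5)."

report = changed_numbers(original, edited)
print(report)
```

In this made-up pair the edit silently turned 71% into 75%, the same kind of slip described for table extraction; an empty report on both sides is the expected result of a safe polish.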

Grammarly: general-purpose. Good for day-to-day writing (emails, drafts). Strengths: speed, ubiquity, works in the browser/Word/Outlook. Weakness: not specialized. It sometimes miscorrects words such as 'patient' and does not understand medical terminology. Premium: about $30 per month

Trinka: academic-focused. Corrects tone, medical terminology, and US/UK usage; handles journal styles and consistency in references. Includes a plagiarism checker. UI: less polished than Grammarly. In 2026: from $6.67/month (annual) for the basic plan, up to ~$20/month for the full plan. The best choice for a paper

Paperpal: similar to Trinka. Developed at Cactus Communications. Plagiarism, grammar, journal styles, similarity check. In 2026, Prime costs $25 per month, $55 per quarter, or $139 per year. Pleasant UI

All 3: Word add-in, web app, browser extension

Choosing: for day-to-day writing, Grammarly. For an academic paper, Trinka or Paperpal. For a power user who does not want to pay for a dedicated tool, Claude Pro or ChatGPT Plus (about $20 per month) is more flexible


Editing prompt template: 'You are a medical journal editor. Edit the following paragraph for publication in JAAD. Rules: Active voice preferred. Past tense for methods and results, present tense for discussion. AMA citation style. Professional hedging where the evidence is not clear-cut. Avoid jargon and redundancy. Don't change numbers, dates, citations, drug names. Show the corrected version + a list of the changes you made. Paragraph: paste'

The result: a corrected version + explanation. You can iterate

More useful prompts: 'Rephrase this paragraph for better flow without changing meaning'. 'Identify any grammatical errors or awkward phrasings'. 'Check tone - is it appropriate for academic medical writing?'. 'Compare two versions and tell me which is better'

Advantage over Grammarly/Trinka: more flexible, takes whole-paragraph context into account, and knows medical terminology if prompted

Disadvantage: you have to remember the prompts. The fix: a saved prompt library


6 categories:

•Tense inconsistency: 'We performed the study and find...' -> 'We performed the study and found...'. Methods past, Results past, Discussion present

•Active vs passive: 'The study was conducted by us' -> 'We conducted the study'. Modern style: active

•Hedging level: 'Drug X cures' (over-claim) -> 'Drug X showed efficacy in...'. AI catches over-claims

•Wordiness: 'in order to ensure that' -> 'to ensure'

•Specific numbers: 'approximately 25' -> '25 (95% CI 23-27)'. Academic writing demands precision

•Anglicized Hebrew: writing English while thinking in Hebrew. Example: 'the patient that was treated' -> 'the patient who was treated'

The difference: the first 3 are grammar/style, which AI catches. The last 3 are content/precision and need a human in addition to AI


⚖️ Ethics, Disclosure, Detection - The Social Side

The use of AI in research raises serious ethical questions. ICMJE (International Committee of Medical Journal Editors) published guidance in its January 2024 update, expanded it in 2025, and in the January 2026 update added a dedicated Section V on AI.

3 ICMJE rules (2026):

•AI cannot be an author. The reason: an author bears responsibility, and AI cannot take responsibility for integrity, accuracy, originality

•Any use of AI in the research or the writing must be disclosed, both in the cover letter and in the paper itself (describing how it was used)

•Responsibility for the content stays with the authors, who must review and edit it

Other 2026 updates: strengthened data access requirements - at least one author with access to the primary dataset in collaborative research.

Example disclosure format:

'We used Claude Opus 4.7 (Anthropic) for draft polishing and GPT-5.5 (OpenAI) for grammar editing. All AI output was reviewed and edited by the authors'

ICMJE 2026 specifically: AI-generated references are prohibited. WAME (World Association of Medical Editors, Sept 2023, updated on an ongoing basis) takes a similar approach. Nature/Springer specifically: 'AI tools may be used as long as they are acknowledged in the methods/acknowledgements'. Lancet, JAMA: similar.

Detection of AI-generated text:

•GPTZero (founded by Princeton student Edward Tian, Jan 2023) - claimed accuracy of 95.7% in 2026, ~1% false positives on polished text

•Originality.ai - ~95% accuracy in 2026 benchmarks

•Turnitin AI detection (2023) - the strictest in 2026 tests

But accuracy drops to 60-80% on paraphrased or heavily edited content, and false positives are higher for non-native English speakers. Liang et al, Patterns 2023: GPTZero and similar detectors flagged over 50% of TOEFL essays by non-native writers as AI, which is not reliable. In 2026 the standard is trust + disclosure, not detection.

Plagiarism vs AI use: they differ. Plagiarism = copying without citation. AI use = using a tool. Both require disclosure, but in different ways.

Limits:

•Reasonable: AI for grammar editing, idea generation, summarization

•Borderline: AI for drafting full paragraphs (needs disclosure + heavy editing)

•Unacceptable: AI producing fabricated data, AI in peer review without disclosure, pasting full AI text without editing

3 templates by degree of use:

Template 1 (Light use - grammar editing):

'The authors used Grammarly Premium and Trinka AI for language polishing of the manuscript. No AI tools were used for content generation, data analysis, or interpretation. The authors are solely responsible for the integrity of the work'

Template 2 (Moderate use - literature, drafting):

'The authors used the following AI tools in this work: NotebookLM (Google) for literature synthesis assistance. Claude Opus 4.7 (Anthropic, released April 2026) for first-draft generation of the discussion section. GPT-5.5 (OpenAI) for grammar editing. All AI-generated content was reviewed, fact-checked against primary sources, and substantially edited by the authors'

Template 3 (Heavy use - SR/extraction):

'The systematic review process incorporated AI assistance: ASReview Lab for screening (95%+ recall validated). Claude Opus 4.7 for data extraction (90% accuracy, 100% manually validated). GPT-5.5 for narrative synthesis assistance. All AI outputs were validated by N reviewers independently. PRISMA-AI flow diagram is provided as Supplementary Figure X'

The approach:

•Be specific

•Be honest

•Be accountable
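
Keeping a log of which tool was used for what makes the statement easy to assemble at submission time. The sketch below is one possible way to do that. The class and function names are hypothetical and the wording is illustrative, so check the target journal's own requirements before using any generated text.

```python
from dataclasses import dataclass


@dataclass
class AIUse:
    tool: str
    vendor: str
    purpose: str


def disclosure_statement(uses, reviewed_by="the authors"):
    """Turn a log of AI tool usage into one disclosure paragraph."""
    if not uses:
        return "No AI tools were used in the preparation of this work."
    parts = [f"{u.tool} ({u.vendor}) for {u.purpose}" for u in uses]
    if len(parts) == 1:
        listed = parts[0]
    else:
        listed = ", ".join(parts[:-1]) + " and " + parts[-1]
    return (
        f"The authors used {listed}. "
        f"All AI output was reviewed and edited by {reviewed_by}, "
        "who take full responsibility for the content."
    )


statement = disclosure_statement([
    AIUse("Claude Opus 4.7", "Anthropic", "draft polishing"),
    AIUse("GPT-5.5", "OpenAI", "grammar editing"),
])
print(statement)
```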


Detection tools:

•GPTZero - founded Jan 2023 by Edward Tian (Princeton student). Free version. Algorithm: perplexity (how predictable the text is to a language model; used to separate human from AI text) + burstiness (variation between sentences; human writing shows more variance). 2026 claims: 95.7% detection, ~1% false positives on polished text. In practice: 88-95% accuracy on raw AI text, falling to 60-80% on paraphrased/edited text

•Originality.ai - commercial. 2026: ~95% accuracy in benchmarks, high false positive rate on academic writing

•Turnitin AI Detection (2023) - integrated into Turnitin. The strictest in 2026 tests. Suppresses scores below 20%

False positives: higher for non-native English speakers (Liang 2023, Patterns). Other studies found false positive rates of 25-30% on TOEFL essays by non-native writers. 1-2% on polished native text, and higher on formal academic style

False negatives: newer models (GPT-5.5, Claude Opus 4.7) sound less like AI. In 2026 tests, accuracy hovers at 88-95% on raw AI text but falls to 60-80% on heavily paraphrased or edited content

The basic problem: as the models improve, they resemble humans more, and detection becomes useless

Academia's conclusion in 2025-2026: do not rely on detection. Instead: trust + disclosure. If an author declares AI use, accept. If not, assume good faith. If caught fabricating, retract

The field is not moving toward anti-doping-style AI testing. In Israel: same approach


Fully acceptable:

•Grammar/typo correction (Grammarly)

•Literature search assistance (PubMed Best Match)

•Citation formatting (Zotero)

•Idea brainstorming

•Structure/outline generation

Acceptable with disclosure:

•Drafting first version of paragraphs (heavily edited after)

•Data extraction (validated)

•Synthesis assistance (NotebookLM)

•Translation (with validation)

Borderline (must disclose, careful):

•AI generates major sections without major editing

•AI does statistical analysis (need to validate code)

•AI selects studies for inclusion (without human override)

Unacceptable:

•AI fabricates data

•AI generates entire manuscript without disclosure

•AI used for peer review without disclosure

•Citations from AI without verification

•Image generation that misleads

Red flags:

•'regenerate response' text left in

•Citations to non-existent papers

•Numbers don't match between sections

•Generic non-specific writing

•Submitting AI text as own

The guiding line: AI is a tool that augments your work. AI is not a ghostwriter


✅ Take-aways for the Research Dermatologist in 2026

The practical recommendation: a stack of 4-5 tools will cover most needs.

The proposal:

•PubMed (free) for the initial search

•Zotero (free) for reference management

•NotebookLM (free) for synthesis and deep dives

•Claude Pro or ChatGPT Plus (about $20 per month) for drafting, polishing, prompt engineering

•Trinka AI (about $20 per month) or Paperpal (about $25 per month) for academic editing

Total: $40-60 per month, covering 95% of research needs.

The flow in a typical study:

•Question formulation - ChatGPT brainstorm

•PubMed search - basic + MeSH + alerts

•Reference management - Zotero collection

•Screening - for a large SR: ASReview/Rayyan. For a narrative review: manual with AI assist

•Reading - NotebookLM upload, Q&A, summaries

•Data extraction - for an SR: Claude Project with a schema

•Synthesis - NotebookLM

•Drafting - Claude Pro

•Editing - Trinka or Claude

•Submission - validate the disclosure statement

In Israel specifically: most of your studies will be clinical (case series, retrospective studies) or review-based. A large SR is less common. Language: if the paper is in English (all the international conferences and scientific journals), work in English throughout. If the paper is for Harefuah: drafting in Hebrew, polishing in English, and then translation.

The recommendation: practice. An hour a day for a month = transformation. The result: your academic research becomes 50% more efficient.

Week 1 - Setup:

•Days 1-2: Zotero + browser extension. Add 20 papers you already have to get to know the tool

•Days 3-4: NotebookLM. Create 2 notebooks on your research topics. Ask 5 questions in each

•Days 5-7: sign up for Claude Pro or ChatGPT Plus. Practice 5 prompts for academic writing

Week 2 - Search workflow:

•Days 8-10: PubMed advanced - MeSH, filters, alerts. Set up 3 alerts on your topics

•Days 11-14: integration. From PubMed search to Zotero to NotebookLM. Run the workflow on a new topic

Week 3 - Drafting:

•Days 15-18: write one section of your own paper with AI assistance. NotebookLM for related work, Claude for drafting, Trinka for polishing

•Days 19-21: review. Which prompts worked? Which were sufficient? Build a prompt library

Week 4 - Optimization:

•Days 22-25: refinement. Add Trinka or Paperpal as your editor. Try a Claude Project with 30 papers

•Days 26-28: final review

Summary: 28 hours invested and over 100 hours saved per year


6 common mistakes:

•'I'll ask AI to do everything.' Mistake: AI makes everything generic. Fix: AI for the mechanical work, a human for the intellectual work

•'I don't need a disclosure, it's only polish.' Mistake: ICMJE requires disclosure even for light use. Fix: always disclose; better safe

•'AI knows more than I do, so I'll trust it.' Mistake: AI hallucinates. Fix: validate every fact, especially numbers and citations

•'I'll do all the screening with AI.' Mistake: AI does not fully understand clinical context. Fix: AI screens 80%, a human reviews edge cases

•'If the AI said it, it's proven.' Mistake: AI returns what is probable, not what is correct. Fix: AI is the starting point, human verification is the endpoint

•'I don't need to read the papers because AI summarized them.' Mistake: an AI summary misses nuance. Fix: read the top 5 key papers in depth, and use AI for the broader context

The conclusion: AI is augmentation, not replacement. Preserving critical thinking is the essence


5 concrete actions:

•Sign up for NotebookLM (free). The most valuable step. Within two weeks you will feel the difference in research speed

•Set up Zotero (free) if you do not have it yet. Browser extension + Word integration. Every paper you read -> 1 click to store

•Set aside an hour a day for a month to learn the stack. Do not try to learn everything at once

•Create a personal prompt library: a folder in Notion/OneDrive with your 10 most useful prompts: literature query, drafting, polishing, DDx, extraction

•Always disclose. Keep a disclosure statement template ready and simply add it to the acknowledgements of every new paper you write

Bonus: follow @AnthropicAI, @OpenAI, and AI-in-research blogs on X. 10 minutes of scrolling a day = updates on trends

Pace of change: in 2026 a new tool or important feature appears every 2-3 months. Better to know


🔑 Bottom line

The complete AI-assisted academic research workflow in 2026: PubMed search, Zotero management, NotebookLM synthesis, ChatGPT/Claude drafting, Trinka polishing. Each stage saves 50-80% of the time. AI does not replace critical thinking; it speeds up the mechanical work. ICMJE requires full disclosure. Detection tools are unreliable; the standard is trust + disclosure. ASReview for screening (80% saving). Custom GPT/Claude Project for extraction (90% accuracy with verification of 10%).

The direction for a research dermatologist in Israel in 2026: a $40-60/month stack (Claude Pro + Trinka + free tools) is enough for most needs. An hour a day for a month = a workflow that works. ROI: over 100 hours saved per year. Always disclose. Always verify. Read 5 key papers in depth yourself, and use AI for the broader context. In Hebrew: a hybrid workflow (English primary, Hebrew translation). The future: AI will move from augmentation to collaboration. But in 2026, augmentation is the state of play.

📋 Source and author details

By: Dr. Yehonatan Kaplan (ד"ר יהונתן קפלן)

Specialist in dermatology and venereology | Mohs surgeon (FACMS)

📅 Published: 1.5.2026 · 🔄 Updated: 1.5.2026

Based on:

Artificial Intelligence Hallucinations in Anaesthesia: Causes, Consequences and Countermeasures

Salvagno M, Taccone FS, Gerli AG

Critical Care, 2023

DOI: 10.1186/s13054-023-04473-y PMID: 37170013

Editorial note: this content was written and edited by Dr. Yehonatan Kaplan (ד"ר יהונתן קפלן) and is based on the original article.

Do not rely on this content without reading the full source.