Towards Fully Automated Systematic Reviews

07 Apr 2025 in Posts / Development

In this post, I’m sharing my journey to build the ultimate paper summarizer agent — a lightweight, command-line tool powered by your favourite LLMs.

Writing a review paper sounded simple at first. Read a bunch of papers, extract key findings, summarize, and synthesize.

Right?

Wrong.

What I thought would be a straightforward academic task quickly became a time-consuming slog. Manually finding the right papers, understanding their content, and organizing them into a cohesive review was far harder than expected. It felt like I was reinventing the wheel every time I sat down to write.

That’s when I had an idea:

What if I built a tool that could do this grunt work for me?

🔍 Find relevant papers based on a search query
🧹 Filter them using inclusion criteria
📖 Read and summarize each one into a structured format
🧾 Eventually help generate the scaffolding of a review paper


The Vision: A Researcher’s Dream Agent

I imagine the ultimate summarizer agent would work like this:

🔹 Input:

A general research topic (e.g. “gene regulatory networks in synthetic biology”)
Optional filters like date, species, or data type
A preferred output format (CSV, table, markdown, etc.)

🔹 Output:

A summary table of key metadata from each paper
Optional highlighted findings or quotes
Possibly even draft review sections, organized by theme
A way to chat with the whole body of papers
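To make that wish list a bit more concrete, here is a minimal sketch of how the inputs and one row of the output table could be modelled. Everything in it (`ReviewRequest`, `PaperSummary`, `to_markdown`, and the field names) is hypothetical and only illustrates the shape of the data; it is not taken from the actual project.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReviewRequest:
    """Hypothetical container for the agent's inputs."""
    topic: str                                              # general research topic
    filters: dict[str, str] = field(default_factory=dict)   # e.g. date, species, data type
    output_format: str = "csv"                              # "csv", "table", or "markdown"


@dataclass
class PaperSummary:
    """Hypothetical row of the summary table."""
    title: str
    year: Optional[int] = None
    highlights: list[str] = field(default_factory=list)     # optional findings or quotes


def to_markdown(rows: list[PaperSummary]) -> str:
    """Render summary rows as a small markdown table."""
    lines = ["| Title | Year | Highlights |", "| --- | --- | --- |"]
    for row in rows:
        year = "" if row.year is None else str(row.year)
        lines.append(f"| {row.title} | {year} | {'; '.join(row.highlights)} |")
    return "\n".join(lines)
```

Keeping the request and the result rows as plain data objects would make it easy to swap output formats later without touching the retrieval or summarization steps.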

Phase 1: Systematic Review Agent Using LangChain + OpenAI

To start, I tackled the hardest piece, and the most useful one (for me at least): screening papers and extracting relevant data from full-length PDFs.

🛠 Tech Stack:

LangChain: to orchestrate chains and agents
Pydantic: to validate at the field and model level
OpenAI GPT-4: to extract structured insights
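As an illustration of what "field and model level" validation can look like, here is a small sketch assuming Pydantic v2. The `PaperExtraction` schema, its field names, and the `parse_extraction` helper are assumptions made up for this example, not the project's real schema. The field validator checks a single value, while the model validator checks that fields are consistent with each other.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, field_validator, model_validator


class PaperExtraction(BaseModel):
    """Hypothetical schema for one extracted paper (field names are illustrative)."""
    title: str
    species: str
    algorithm: Optional[str] = None
    accuracy: Optional[float] = None

    @field_validator("species")
    @classmethod
    def normalize_species(cls, value: str) -> str:
        # Field-level check: reject blanks, normalize case and whitespace.
        cleaned = value.strip().lower()
        if not cleaned:
            raise ValueError("species must not be empty")
        return cleaned

    @model_validator(mode="after")
    def accuracy_needs_algorithm(self) -> "PaperExtraction":
        # Model-level check: an accuracy value only makes sense with a named algorithm.
        if self.accuracy is not None and self.algorithm is None:
            raise ValueError("accuracy reported without an algorithm")
        return self


def parse_extraction(raw: dict) -> Optional[PaperExtraction]:
    """Return a validated record, or None if the LLM output fails validation."""
    try:
        return PaperExtraction.model_validate(raw)
    except ValidationError:
        return None
```

Returning `None` on failure is just one option; in practice you might prefer to log the validation errors or ask the model to retry.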

🧑‍💻 Try It Yourself

I’ve open-sourced the full project here:

🔗 View the GitHub Repository

Key Features

🔍 Retrieve papers directly from PubMed Central
🧹 Screen abstracts with GPT-4o against your custom objectives
📊 Extract structured metadata (species, algorithms, results, etc.)
🧾 Export clean JSON/CSV summaries for analysis
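For a rough idea of the first and last steps, the sketch below builds a search URL for PubMed Central and serializes extracted records to CSV and JSON using only the standard library. It assumes retrieval goes through NCBI's public E-utilities ESearch endpoint, which may not be how the tool does it internally, and the helper names (`build_pmc_search_url`, `to_csv`, `to_json`) are made up for illustration.

```python
import csv
import io
import json
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (assumed here; the project may use another client).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pmc_search_url(query: str, max_results: int = 50) -> str:
    """Build an ESearch URL that queries the PubMed Central ("pmc") database."""
    params = {"db": "pmc", "term": query, "retmax": max_results, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def to_csv(records: list[dict]) -> str:
    """Serialize records to CSV, using the union of all keys as the header."""
    fieldnames: list[str] = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    # restval fills in an empty cell when a record lacks one of the columns.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records: list[dict]) -> str:
    """Serialize records to pretty-printed JSON."""
    return json.dumps(records, indent=2, ensure_ascii=False)
```

Fetching the URL (for example with `urllib.request.urlopen`) and parsing the returned ID list is left out so that the sketch runs without network access.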

🚀 Example Run

```bash
research_agent analyze \
  --task meta \
  --objective "AI for diagnosing UTIs in dogs" \
  --filter-papers True \
  --max-results 50 \
  --output-dir ./output
```

Makan Farhoodi
