Towards Fully Automated Systematic Reviews

In this post, I’m sharing my journey to build the ultimate paper summarizer agent — a lightweight, command-line tool powered by your favourite LLMs.

Writing a review paper sounded simple at first. Read a bunch of papers, extract key findings, summarize, and synthesize.

Right?

Wrong.

What I thought would be a straightforward academic task quickly became a time-consuming slog. Manually finding the right papers, understanding their content, and organizing them into a cohesive review was far harder than expected. It felt like I was reinventing the wheel every time I sat down to write.

That’s when I had an idea:

What if I built a tool that could do this grunt work for me?

  • 🔍 Find relevant papers based on a search query
  • 🧹 Filter them using inclusion criteria
  • 📖 Read and summarize each one into a structured format
  • 🧾 Eventually help generate the scaffolding of a review paper
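To make the first three steps concrete, here is a minimal sketch of how they could chain together. It is not the project's code: every name (`Paper`, `find_papers`, `meets_criteria`, `summarize`, `run_pipeline`) is hypothetical, the sample papers are made up, and naive keyword matching stands in for the literature search and the LLM calls.

```python
import re
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    abstract: str


def tokens(text: str) -> set[str]:
    """Lowercase word set, used for naive keyword matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_papers(query: str, corpus: list[Paper]) -> list[Paper]:
    """Step 1 (placeholder): keep papers that share any word with the query."""
    wanted = tokens(query)
    return [p for p in corpus if wanted & tokens(p.title + " " + p.abstract)]


def meets_criteria(paper: Paper, required_terms: list[str]) -> bool:
    """Step 2 (placeholder): inclusion criteria expressed as required keywords."""
    words = tokens(paper.abstract)
    return all(term.lower() in words for term in required_terms)


def summarize(paper: Paper) -> dict[str, str]:
    """Step 3 (placeholder): use the abstract's first sentence as the summary."""
    first_sentence = paper.abstract.split(". ")[0].rstrip(".")
    return {"title": paper.title, "summary": first_sentence}


def run_pipeline(query: str, required_terms: list[str], corpus: list[Paper]) -> list[dict[str, str]]:
    """Find, then filter, then summarize."""
    found = find_papers(query, corpus)
    kept = [p for p in found if meets_criteria(p, required_terms)]
    return [summarize(p) for p in kept]


# Made-up corpus, for illustration only
corpus = [
    Paper("Urine ML", "Machine learning detects urinary infections in dogs. Accuracy was high."),
    Paper("Cat diets", "A survey of feline diets. No dogs were studied."),
    Paper("Soil microbes", "Soil bacteria fix nitrogen."),
]
results = run_pipeline("infections in dogs", ["infections"], corpus)
print(results)
```

In a real agent, each placeholder would be swapped for a search API call or an LLM prompt, but the find, filter, summarize shape stays the same.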



The Vision: A Researcher’s Dream Agent

I imagine the ultimate summarizer agent would work like this:

🔹 Input:

  • A general research topic (e.g. “gene regulatory networks in synthetic biology”)
  • Optional filters like date, species, or data type
  • A preferred output format (CSV, table, markdown, etc.)

🔹 Output:

  • A summary table of key metadata from each paper
  • Optional highlighted findings or quotes
  • Possibly even draft review sections, organized by themes
  • A way to ask questions of the whole body of papers

Phase 1: Systematic Review Agent Using LangChain + OpenAI

To start, I tackled the hardest piece, and for me at least the most useful one: screening papers and extracting relevant data from full-length PDFs.

🛠 Tech Stack:

  • LangChain: to orchestrate chains and agents
  • Pydantic: to validate at the field and model level
  • OpenAI GPT-4: to extract structured insights
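To illustrate what "field and model level" validation means, here is a small sketch that assumes Pydantic v2. The `PaperRecord` model and its fields are hypothetical and are not the schema used in the project. A field validator checks one value in isolation, while a model validator checks that several fields are consistent with each other.

```python
from pydantic import BaseModel, ValidationError, field_validator, model_validator


class PaperRecord(BaseModel):
    """Hypothetical record for one extracted paper (illustrative fields only)."""

    title: str
    species: str
    sample_size: int
    positive_cases: int

    @field_validator("species")
    @classmethod
    def species_not_blank(cls, value: str) -> str:
        # Field-level check: looks at a single field and normalizes it
        value = value.strip()
        if not value:
            raise ValueError("species must not be empty")
        return value.lower()

    @model_validator(mode="after")
    def cases_within_sample(self) -> "PaperRecord":
        # Model-level check: compares two fields against each other
        if self.positive_cases > self.sample_size:
            raise ValueError("positive_cases cannot exceed sample_size")
        return self


# A consistent record passes and gets its species normalized
record = PaperRecord(title="Example", species=" Dog ", sample_size=120, positive_cases=30)

# An inconsistent record is rejected by the model-level check
try:
    PaperRecord(title="Example", species="dog", sample_size=10, positive_cases=11)
    error_count = 0
except ValidationError as exc:
    error_count = len(exc.errors())
```

Validating LLM output this way means a malformed or self-contradictory extraction fails loudly instead of silently ending up in the summary table.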

🧑‍💻 Try It Yourself

I’ve open-sourced the full project here:

🔗 View the GitHub Repository

Key Features

  • 🔍 Retrieve papers directly from PubMed Central
  • 🧹 Screen abstracts with GPT-4o against your custom objectives
  • 📊 Extract structured metadata (species, algorithms, results, etc.)
  • 🧾 Export clean JSON/CSV summaries for analysis
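The export step is the easiest one to picture in code. The sketch below uses only the Python standard library; the function name `export_summaries`, the file names, and the sample records are assumptions for illustration, not the tool's actual implementation or output.

```python
import csv
import json
import tempfile
from pathlib import Path


def export_summaries(records: list[dict], output_dir: str) -> tuple[Path, Path]:
    """Write the same records as summaries.json and summaries.csv."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    json_path = out / "summaries.json"
    csv_path = out / "summaries.csv"

    json_path.write_text(json.dumps(records, indent=2), encoding="utf-8")

    # Union of keys in first-seen order, so a missing field becomes an empty cell
    fieldnames = list(dict.fromkeys(key for rec in records for key in rec))
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
    return json_path, csv_path


# Made-up records, for illustration only
records = [
    {"title": "Paper A", "species": "dog", "algorithm": "random forest"},
    {"title": "Paper B", "species": "dog"},
]

with tempfile.TemporaryDirectory() as tmp:
    json_path, csv_path = export_summaries(records, tmp)
    csv_text = csv_path.read_text(encoding="utf-8")
    json_roundtrip = json.loads(json_path.read_text(encoding="utf-8"))
```

Building the CSV header from the union of keys keeps the table rectangular even when the extraction step fails to find a field for some papers.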

🚀 Example Run

```bash
research_agent analyze \
  --task meta \
  --objective "AI for diagnosing UTIs in dogs" \
  --filter-papers True \
  --max-results 50 \
  --output-dir ./output
```


© 2022. Makan Farhoodi.