jsh574/ResearchMate-R

ResearchMate-R

ResearchMate-R is a safety-first research automation toolkit for literature survey, idea generation, experiment execution, result plotting, paper writeup, paper review, and reproducibility preparation.

It is designed as a human-in-the-loop research companion: the system can search papers, draft plans, inspect source code, generate reports, and run conservative checks, while risky actions such as cloning repositories or executing external code remain opt-in.

Maintainer: jsh <[email protected]>

Highlights

  • Unified research CLI: one entry point for survey, reproduction, ideation, experiments, plotting, writeup, and review workflows.
  • Interactive mode: run python launch_researchmate.py and select a function from a terminal menu.
  • Literature survey agent: plans queries, retrieves papers, filters relevance, writes a cited survey, and exports BibTeX.
  • Paper reproduction assistant: finds source candidates, selects likely code, writes a reproduction plan, and generates an auditable report.
  • Safe-by-default execution: no cloning and no code execution unless explicit flags are provided.
  • Local-source workflow: analyze manually downloaded repositories or supplementary material without relying on GitHub availability.
  • Experiment-generation pipeline: supports idea-driven research execution, tree-search logs, plot aggregation, paper drafting, and review loops.
  • Multi-model backend: supports DeepSeek, OpenAI-compatible models, Anthropic Claude, Gemini, Bedrock, Vertex AI, and Ollama-style local models when configured.

What It Can Do

Task                          Command                  Main outputs
----------------------------  -----------------------  --------------------------------------------------------------------
Literature survey             researchmate survey      survey.md, references.bib, papers.json, search_plan.json
Paper reproduction planning   researchmate reproduce   source candidates, selected source, reproduction plan, run log, report
Research idea generation      researchmate idea        generated idea JSON
Agentic experiment workflow   researchmate experiment  experiment workspaces, logs, plots, writeups, reviews
Plot aggregation              researchmate plot        final analysis scripts and figures
Paper writeup                 researchmate writeup     LaTeX paper draft
LLM paper review              researchmate review      structured review JSON
VLM figure review             researchmate vlm-review  figure/caption/reference review JSON
Environment check             researchmate doctor      dependency, API-key, and upload-safety diagnostics

Installation

ResearchMate-R targets Python 3.10+ on Linux.

conda create -n researchmate python=3.10
conda activate researchmate
pip install -r requirements.txt

Optional editable installation:

pip install -e .

The repository name is ResearchMate-R; the importable Python package is researchmate_r.

API Keys

Create a local .env file from the template:

cp .env.example .env

Fill only the keys needed for your selected model or workflow:

DEEPSEEK_API_KEY=...
DEEPSEEK_BASE_URL=https://api.deepseek.com
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
S2_API_KEY=...
GITHUB_TOKEN=...

Do not commit .env. It is ignored by .gitignore.
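
Before a long run, it can help to confirm which keys are actually visible. The sketch below is an illustration only, not how ResearchMate-R loads its configuration: the helper names read_env_file and key_status are hypothetical, and the parser handles only plain KEY=VALUE lines. It reports whether each key is set without printing any values.

```python
import os
from pathlib import Path

# Key names copied from the .env example above; which ones you need
# depends on the model and workflow you select.
KNOWN_KEYS = [
    "DEEPSEEK_API_KEY",
    "DEEPSEEK_BASE_URL",
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "S2_API_KEY",
    "GITHUB_TOKEN",
]


def read_env_file(path):
    """Parse simple KEY=VALUE lines; skips blanks and '#' comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def _is_filled(value):
    """Treat empty strings and the template placeholder '...' as unset."""
    return bool(value) and value != "..."


def key_status(env_file=".env", environ=None):
    """Return {key: bool} for each known key, never the key values."""
    environ = os.environ if environ is None else environ
    file_values = read_env_file(env_file)
    return {
        key: _is_filled(environ.get(key)) or _is_filled(file_values.get(key))
        for key in KNOWN_KEYS
    }


# Example usage (run from the repository root):
#   for key, is_set in key_status().items():
#       print(f"{key}: {'set' if is_set else 'missing'}")
```

The doctor subcommand described later reports whether API keys are set; this sketch is only a standalone approximation of that idea.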

Quick Start

Interactive menu

The simplest way to use ResearchMate-R is the interactive CLI:

python launch_researchmate.py

You will see:

=== ResearchMate-R Interactive CLI ===
1. survey      Literature survey
2. reproduce   Paper reproduction and source verification
3. experiment  Experiment-generation workflow
4. idea        Research idea generation
5. plot        Final plot aggregation
6. writeup     LaTeX paper writeup
7. review      LLM paper review
8. vlm-review  Vision-language figure/caption/reference review
9. doctor      Environment and upload-safety checks
0. exit

Command mode

For scripts and reproducible runs, call subcommands directly:

python launch_researchmate.py --help
python launch_researchmate.py doctor

If installed with pip install -e ., the same interface is available as:

researchmate
researchmate --help
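
For scripted runs, the launcher can also be driven from Python through subprocess. The following is a minimal sketch under stated assumptions: build_command and run are hypothetical helper names, the script is assumed to be started from the repository root, and the underscore-to-hyphen mapping matches only the hyphenated flags shown in this README (for example --max-papers).

```python
import subprocess
import sys

LAUNCHER = "launch_researchmate.py"


def build_command(subcommand, **options):
    """Build an argv list for one launcher subcommand.

    Keyword names map to long flags: max_papers=12 -> --max-papers 12.
    True becomes a bare flag; False and None are skipped.
    """
    argv = [sys.executable, LAUNCHER, subcommand]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            argv.append(flag)
        elif value is False or value is None:
            continue
        else:
            argv.extend([flag, str(value)])
    return argv


def run(subcommand, **options):
    """Run the launcher and raise CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(subcommand, **options), check=True)


# Example usage:
#   run("doctor")
#   run("survey", topic="trustworthy reinforcement learning for LLMs",
#       model="deepseek-chat", max_papers=12)
```

Passing an argument list instead of a shell string avoids quoting problems with topics or titles that contain spaces. Arguments forwarded to the experiment subcommand use underscores (for example --load_ideas), so this helper does not apply to them unchanged.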

Common Workflows

1. Literature survey

Generate a cited survey for a topic:

python launch_researchmate.py survey \
  --topic "trustworthy reinforcement learning for LLMs" \
  --output-dir literature_surveys/trustworthy_rl_llms \
  --model deepseek-chat \
  --num-queries 4 \
  --max-results-per-query 5 \
  --max-papers 12 \
  --language English \
  --report-words 2500

Typical outputs:

literature_surveys/<topic>/
  survey.md
  references.bib
  papers.json
  search_plan.json
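
After a survey run, a quick completeness check can confirm that the four files listed above were written. This is a small sketch, not part of ResearchMate-R; the function name missing_outputs is hypothetical, and it only checks that the files exist, not that their contents are valid.

```python
from pathlib import Path

# File names taken from the "Typical outputs" listing above.
EXPECTED_SURVEY_FILES = (
    "survey.md",
    "references.bib",
    "papers.json",
    "search_plan.json",
)


def missing_outputs(output_dir, expected=EXPECTED_SURVEY_FILES):
    """Return the expected file names that are absent from output_dir."""
    root = Path(output_dir)
    return [name for name in expected if not (root / name).is_file()]


# Example usage:
#   missing = missing_outputs("literature_surveys/trustworthy_rl_llms")
#   if missing:
#       print("Survey run looks incomplete, missing:", ", ".join(missing))
```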

2. Paper reproduction planning

The default mode is non-invasive: it searches and plans, but does not clone repositories or run code.

python launch_researchmate.py reproduce \
  --paper-title "Attention Is All You Need" \
  --output-dir reproductions/attention_is_all_you_need \
  --model deepseek-chat \
  --max-results 5

Typical outputs:

reproductions/<paper>/
  paper_metadata.json
  source_candidates.json
  source_selection.json
  reproduction_plan.md
  run_log.json
  reproduction_report.md
  generated_reimplementation/

To allow cloning a selected GitHub repository:

python launch_researchmate.py reproduce \
  --paper-title "Attention Is All You Need" \
  --allow-clone

To analyze a local repository or supplementary-material folder:

python launch_researchmate.py reproduce \
  --paper-title "Your Paper Title" \
  --local-source /path/to/local/source \
  --offline \
  --allow-run

--allow-run currently enables conservative smoke checks such as:

python -m compileall -q .

It does not install dependencies, download model checkpoints, launch training, or run arbitrary project scripts.
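
The same kind of smoke check can be run from Python with the standard-library compileall module. This sketch illustrates the check itself, not ResearchMate-R's internal implementation; smoke_check is a hypothetical name. Byte-compiling parses each file and reports syntax errors, but it does not import or execute the project's code.

```python
import compileall


def smoke_check(source_dir):
    """Byte-compile every .py file under source_dir.

    Returns True only if all files compile. quiet=1 prints errors
    only, which corresponds to the -q flag of the command-line form.
    """
    return bool(compileall.compile_dir(source_dir, quiet=1))


# Example usage:
#   ok = smoke_check("/path/to/local/source")
#   print("compiles cleanly" if ok else "syntax errors found")
```

Note that compileall writes .pyc files into __pycache__ directories inside the checked tree, so run it on a copy if the source folder must stay untouched.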

3. Research idea generation

Generate ideas from a workshop or topic description:

python launch_researchmate.py idea \
  --workshop-file researchmate_r/ideas/i_cant_believe_its_not_better.md \
  --output-file researchmate_r/ideas/generated_ideas.json \
  --model deepseek-chat \
  --max-num-generations 1 \
  --num-reflections 5

4. Experiment-generation workflow

Run the existing experiment-generation pipeline by forwarding arguments after experiment --:

python launch_researchmate.py experiment -- \
  --load_ideas researchmate_r/ideas/quick_bfts_smoke.json \
  --idea_idx 0 \
  --model_writeup deepseek-chat \
  --model_writeup_small deepseek-chat

This workflow can load ideas, prepare an experiment workspace, run staged agentic search, aggregate plots, draft papers, and run review modules. Full runs may require CUDA, dataset-specific dependencies, LaTeX tools, and additional Python packages.
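
The forwarding convention used above (everything after a bare -- is handed to the underlying pipeline) can be sketched as follows. This is an assumption about the general pattern, not a copy of the code in researchmate_r/cli.py; split_forwarded is a hypothetical name.

```python
def split_forwarded(argv):
    """Split argv at the first bare '--'.

    Returns (own_args, forwarded_args). Without a '--', nothing is
    forwarded and all arguments belong to the wrapper itself.
    """
    if "--" in argv:
        idx = argv.index("--")
        return list(argv[:idx]), list(argv[idx + 1:])
    return list(argv), []


# Example usage:
#   own, forwarded = split_forwarded(
#       ["experiment", "--", "--load_ideas", "ideas.json", "--idea_idx", "0"]
#   )
#   # own holds the wrapper's arguments, forwarded goes to the pipeline.
```

Splitting at the separator keeps the wrapper from having to know every option of the forwarded pipeline.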

The tree-search defaults are stored in:

bfts_config.yaml

5. Plot aggregation

Aggregate final plots for an experiment folder:

python launch_researchmate.py plot \
  --folder experiments/<experiment-folder> \
  --model deepseek-chat \
  --reflections 5

6. Paper writeup

Generate a LaTeX writeup from an experiment folder:

python launch_researchmate.py writeup \
  --folder experiments/<experiment-folder> \
  --type icbinb \
  --model deepseek-chat \
  --big-model deepseek-chat \
  --writeup-reflections 3

Supported writeup types:

  • icbinb: compact paper-style writeup.
  • normal: standard longer writeup.

7. LLM and VLM review

Run an LLM-based review for a paper PDF:

python launch_researchmate.py review \
  --pdf experiments/<experiment-folder>/latex/template.pdf \
  --output-json experiments/<experiment-folder>/review.json \
  --model deepseek-chat

Run a VLM-based figure, caption, and reference review:

python launch_researchmate.py vlm-review \
  --pdf experiments/<experiment-folder>/latex/template.pdf \
  --output-json experiments/<experiment-folder>/vlm_review.json \
  --model gpt-4o-mini-2024-07-18

Safety Model

ResearchMate-R is designed to keep high-risk operations explicit.

  • No clone by default: ResearchMate-R clones a selected GitHub repository only when --allow-clone is passed.
  • No code execution by default: conservative smoke checks run only when --allow-run is passed.
  • Local source supported: pass --local-source to inspect a directory you already downloaded.
  • Offline mode supported: pass --offline to skip external paper/source searches.
  • Generated artifacts are ignored: output directories such as literature_surveys/, reproductions/, and experiments/ are ignored by Git.

Always inspect generated plans, reports, code, and reviews before using them in research.
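
The opt-in pattern behind these flags can be sketched with argparse: every risky capability is a store_true flag that defaults to off. The code below is an illustration under stated assumptions, not the actual parser in researchmate_r/cli.py. The flag names come from this README; build_parser, planned_actions, and the way the flags interact (for example, a local source taking precedence over cloning) are hypothetical.

```python
import argparse


def build_parser():
    """Parser with the safety-related flags named in this README."""
    parser = argparse.ArgumentParser(prog="reproduce-sketch")
    parser.add_argument("--paper-title", required=True)
    parser.add_argument("--local-source", default=None)
    # store_true flags default to False, so risky steps stay disabled
    # unless the user asks for them explicitly.
    parser.add_argument("--offline", action="store_true")
    parser.add_argument("--allow-clone", action="store_true")
    parser.add_argument("--allow-run", action="store_true")
    return parser


def planned_actions(args):
    """List the steps a run would take (assumed interplay of flags)."""
    actions = []
    if not args.offline:
        actions.append("search for paper and source candidates")
    if args.local_source:
        actions.append(f"inspect local source at {args.local_source}")
    elif args.allow_clone and not args.offline:
        actions.append("clone selected repository")
    actions.append("write reproduction plan and report")
    if args.allow_run:
        actions.append("run conservative smoke checks")
    return actions


# Example usage:
#   args = build_parser().parse_args(["--paper-title", "Your Paper Title"])
#   print(planned_actions(args))
```

Listing the planned steps before doing anything makes the effect of each flag easy to audit.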

Environment Check

Run:

python launch_researchmate.py doctor

This reports:

  • Paths: project root and package path.
  • Python dependencies: key packages required by survey, reproduction, experiment, PDF, and review workflows.
  • Environment variables: whether API keys are set.
  • Upload safety: whether generated-output directories and .env exist.

If doctor reports missing packages such as funcy, fitz, pymupdf, or pymupdf4llm, install the full requirements:

pip install -r requirements.txt
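
A dependency check of this kind can be approximated with importlib, which looks up a module without importing it. This is a sketch of the idea rather than the doctor implementation; missing_modules is a hypothetical name, and the module list is taken from the package names mentioned above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example usage, with the names mentioned above:
#   print(missing_modules(["funcy", "fitz", "pymupdf", "pymupdf4llm"]))
```

Only top-level names are passed here, because find_spec on a dotted name imports the parent package and raises if that parent is missing.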

Project Structure

ResearchMate-R/
  launch_researchmate.py              # unified CLI entry point
  launch_literature_survey.py         # standalone survey launcher
  launch_paper_reproduction.py        # standalone reproduction launcher
  launch_scientist_bfts.py            # experiment-generation launcher
  bfts_config.yaml                    # tree-search experiment defaults
  requirements.txt                    # Python dependencies
  pyproject.toml                      # package metadata and console scripts

  researchmate_r/
    cli.py                            # interactive and subcommand CLI
    llm.py                            # LLM client factory
    vlm.py                            # VLM client factory
    perform_literature_survey.py      # literature survey workflow
    perform_paper_reproduction.py     # paper reproduction workflow
    perform_ideation_temp_free.py     # research idea generation
    perform_plotting.py               # plot aggregation
    perform_writeup.py                # standard paper writeup
    perform_icbinb_writeup.py         # compact paper writeup
    perform_llm_review.py             # LLM review
    perform_vlm_review.py             # figure/caption/reference review
    tools/semantic_scholar.py         # scholarly search utilities
    treesearch/                       # agentic experiment infrastructure

Generated outputs are intentionally not tracked:

literature_surveys/
reproductions/
experiments/
logs/
workspaces/
cache/
huggingface/

Troubleshooting

  • Missing API key: set the relevant key in .env or your shell, for example DEEPSEEK_API_KEY or OPENAI_API_KEY.
  • experiment fails with missing funcy: install the full dependencies with pip install -r requirements.txt.
  • PDF review fails with missing PyMuPDF packages: install the full dependencies with pip install -r requirements.txt.
  • LaTeX PDF compilation fails: install a LaTeX distribution that provides tools such as pdflatex and bibtex.
  • GitHub source search is unstable: use --local-source with a manually downloaded repository or supplementary-material directory.

Development Checks

Useful local checks:

python -m py_compile researchmate_r/cli.py
python launch_researchmate.py --help
python launch_researchmate.py doctor

Before pushing to GitHub:

git status --short
git check-ignore -v .env literature_surveys reproductions experiments cache huggingface

Make sure .env, generated experiment outputs, model checkpoints, datasets, and large downloaded artifacts are not tracked.
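
As a complement to git check-ignore, the list of files Git already tracks can be scanned for anything that should have stayed local. The sketch below is an assumption-laden helper, not part of the repository: risky_tracked and tracked_files are hypothetical names, and the directory set is copied from the untracked-output list in Project Structure.

```python
import subprocess
from pathlib import PurePosixPath

# Directories this README lists as intentionally untracked.
GENERATED_DIRS = {
    "literature_surveys",
    "reproductions",
    "experiments",
    "logs",
    "workspaces",
    "cache",
    "huggingface",
}


def risky_tracked(paths):
    """Return tracked paths that are .env or live under a generated dir."""
    risky = []
    for path in paths:
        parts = PurePosixPath(path).parts
        if path == ".env" or (parts and parts[0] in GENERATED_DIRS):
            risky.append(path)
    return risky


def tracked_files():
    """List files tracked by Git (requires git and a repository)."""
    result = subprocess.run(
        ["git", "ls-files"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


# Example usage (run from the repository root):
#   for path in risky_tracked(tracked_files()):
#       print("tracked but should be ignored:", path)
```

An empty result means none of the tracked paths match; it does not detect large files or checkpoints stored elsewhere in the tree.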

Responsible Use

ResearchMate-R may call external LLM APIs, search scholarly services, inspect source code, and optionally run conservative local checks. Its outputs can contain mistakes, hallucinated claims, incomplete citations, or incorrect reproduction assumptions.

Use it as an assistant, not as an authority. Review all generated artifacts before relying on them in research, writing, or evaluation.

License

See LICENSE for the project license and responsible-use restrictions.
