8 changes: 8 additions & 0 deletions LOG.md
@@ -1,5 +1,13 @@
# Release Log

## 0.1.3

- Added a core-code blueprint step before generated files are cached for export.
- Reworked code generation guidance to avoid fixed full-project templates and focus on the paper's minimal computational contribution.
- Added local validation so generated code files must exactly match the blueprint file list.
- Added blueprint metadata to cached code bundles and export README content.
- Added MiniMax and GLM model providers alongside DeepSeek and Jiekou.

## 0.1.2

- Added Jiekou as a model provider alongside DeepSeek.
35 changes: 26 additions & 9 deletions README.md
@@ -2,28 +2,45 @@

English | [简体中文](README.zh-CN.md)

Paper2CoreCode is a desktop tool that turns research papers into readable summaries and exportable core code.
Paper2CoreCode is a desktop tool that turns research papers into readable summaries and exportable minimal core code.

It is designed for researchers, engineers, and students who want to quickly understand a paper and, when possible, obtain a componentized implementation scaffold.
It is designed for researchers, engineers, and students who want to quickly understand a paper and, when possible, obtain a small implementation of the paper's core computational contribution.

## What It Does ✨

- 📄 Analyze academic paper PDFs.
- 🧠 Generate structured paper summaries with DeepSeek / Jiekou.
- 🧠 Generate structured paper summaries with DeepSeek / Jiekou / MiniMax / GLM.
- 🧮 Render Markdown, tables, and LaTeX formulas clearly.
- 💻 Decide whether the paper needs core code.
- 🧭 Plan a minimal core-code blueprint before writing files.
- 📦 Export generated core code as a local project folder.
- 🌐 Switch between Chinese and English UI/output.
- 🖥️ Run as a local Electron desktop app.

## Core Workflow 🚀

1. Select a provider (DeepSeek / Jiekou) and enter your API key in the sidebar.
1. Select a provider (DeepSeek / Jiekou / MiniMax / GLM) and enter your API key in the sidebar.
2. Choose a model.
3. Select a paper PDF.
4. Start analysis.
5. Read the streamed summary.
6. Download generated core code if available.
6. If code is applicable, the model first plans the smallest file set needed for the paper's core contribution.
7. Download generated core code if the blueprint and files pass local validation.

## Core Code Generation

Paper2CoreCode is intentionally not a full experiment-reproduction generator. It aims to export only the smallest reusable code needed to represent the paper's core computational contribution.

Before code is cached for download, the model must produce a core-code blueprint that describes:

- The inferred paper domain.
- The core contribution to implement.
- The minimal implementation boundary.
- The exact files to generate.
- The purpose and main symbols for each file.
- Items intentionally omitted because they are not part of the core contribution.
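
As an illustration, a blueprint for a paper that proposes a single loss function might look like the object below. The field names follow the `CodeBlueprint` interfaces added to `src/main/backend/codeCache.ts` in this change (optional fields trimmed); the paper, the file path, and every value are hypothetical.

```typescript
// Shapes follow the CodeBlueprint interfaces in src/main/backend/codeCache.ts
// (some optional fields are left out for brevity).
interface CodeBlueprintFile {
  path: string
  purpose: string
  mainSymbols: string[]
  mustInclude: string[]
  mustNotInclude: string[]
  assumptions?: string[]
}

interface CodeBlueprint {
  paperDomain?: string
  coreContribution: string
  minimalImplementationBoundary: string
  files: CodeBlueprintFile[]
  omitted?: Array<{ item: string; reason: string }>
}

// Hypothetical example: every value below is invented for illustration.
const exampleBlueprint: CodeBlueprint = {
  paperDomain: 'Computer vision (object detection)',
  coreContribution: 'A re-weighted classification loss for imbalanced classes.',
  minimalImplementationBoundary: 'The loss function only; no model, data loading, or training loop.',
  files: [
    {
      path: 'core/loss.py',
      purpose: 'Implements the proposed loss.',
      mainSymbols: ['reweighted_loss'],
      mustInclude: ['loss formula from the method section'],
      mustNotInclude: ['training loop', 'dataset code'],
      assumptions: ['Inputs are raw logits.']
    }
  ],
  omitted: [
    { item: 'Training script', reason: 'Not part of the core contribution.' },
    { item: 'Baselines', reason: 'Only needed for experiment reproduction.' }
  ]
}

// The file list is what the generated files are later compared against.
console.log(exampleBlueprint.files.map((file) => file.path))
```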

Generated files must match the blueprint exactly. Extra files are rejected, missing blueprint files are rejected, and unsafe paths are rejected. This helps avoid over-generating training scripts, datasets, baselines, experiment runners, or full application pipelines when the paper only proposes a smaller method such as a loss, module, dispatch rule, signal-processing algorithm, controller, estimator, or objective function.
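
The exact-match rule can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's validator, which does not appear in this diff: `validateAgainstBlueprint` and `isSafeRelativePath` are hypothetical names, and treating absolute paths, drive letters, empty segments, and `..` segments as unsafe is an assumed rule set.

```typescript
interface GeneratedFile {
  path: string
  content: string
}

interface ValidationResult {
  ok: boolean
  errors: string[]
}

// Hypothetical rule set: reject empty paths, absolute paths, drive letters,
// empty segments, and ".." segments.
function isSafeRelativePath(filePath: string): boolean {
  const normalized = filePath.replace(/\\/g, '/')
  if (normalized.length === 0) return false
  if (normalized.charAt(0) === '/' || /^[A-Za-z]:/.test(normalized)) return false
  return normalized.split('/').every((segment) => segment !== '' && segment !== '..')
}

// Hypothetical validator: generated paths must match the blueprint paths exactly.
function validateAgainstBlueprint(blueprintPaths: string[], files: GeneratedFile[]): ValidationResult {
  const errors: string[] = []
  const generatedPaths = files.map((file) => file.path)

  for (const filePath of generatedPaths) {
    if (!isSafeRelativePath(filePath)) {
      errors.push(`Unsafe path: ${filePath}`)
    } else if (blueprintPaths.indexOf(filePath) === -1) {
      errors.push(`Extra file not in blueprint: ${filePath}`)
    }
  }

  for (const expectedPath of blueprintPaths) {
    if (generatedPaths.indexOf(expectedPath) === -1) {
      errors.push(`Missing blueprint file: ${expectedPath}`)
    }
  }

  return { ok: errors.length === 0, errors }
}

const result = validateAgainstBlueprint(
  ['core/loss.py'],
  [
    { path: 'core/loss.py', content: '# loss' },
    { path: 'train.py', content: '# training loop' },
    { path: '../escape.py', content: '' }
  ]
)
console.log(result.ok) // false
console.log(result.errors) // one "Extra file" error and one "Unsafe path" error
```

Collecting every violation instead of stopping at the first one is a design choice made for this sketch; it lets a caller report all problems with a generated bundle at once.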

## Model Providers

@@ -42,13 +59,13 @@ Some Jiekou GPT variants are shown as unsupported and disabled in the model sele

Pull requests run `npm run build` on Windows, macOS, and Linux through GitHub Actions.

Version tags like `v0.1.2` trigger the release workflow, which builds platform packages for Windows, macOS, and Linux.
Version tags like `v0.1.3` trigger the release workflow, which builds platform packages for Windows, macOS, and Linux.

## Tech Stack 🛠️

- Electron + TypeScript
- React + Vite
- DeepSeek / Jiekou API(OpenAI-compatible
- DeepSeek / Jiekou / MiniMax / GLM APIs (OpenAI-compatible)
- `pdf-parse`
- `react-markdown` + KaTeX
- `electron-builder`
@@ -61,7 +78,7 @@ Prebuilt packages are published on the [GitHub Releases](https://github.com/lemo
- macOS: `.dmg` and `.zip`
- Linux: `.AppImage` and `.deb`

Release packages are generated automatically when a version tag like `v0.1.0` is pushed.
Release packages are generated automatically when a version tag like `v0.1.3` is pushed.

## Development

@@ -90,7 +107,7 @@ Build artifacts are generated in `release/`.

- API keys are stored locally in the app user data directory.
- Scanned PDFs without extractable text are not supported yet.
- Generated code is cached locally first, then exported by the user.
- Generated code is cached locally first, then exported by the user after blueprint validation.
- Current desktop builds are unsigned and use the default Electron icon.

## License
35 changes: 26 additions & 9 deletions README.zh-CN.md
@@ -2,28 +2,45 @@

[English](README.md) | 简体中文

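Paper2CoreCode is a desktop tool for reading papers and generating core code. It converts paper PDFs into structured summaries and, when the paper is suitable for reproduction, exports a core code project.
Paper2CoreCode is a desktop tool for reading papers and generating minimal core code. It converts paper PDFs into structured summaries and, when the paper is suitable for implementation, exports minimal core code.

It helps researchers, engineers, and students quickly understand a paper and obtain a componentized implementation scaffold for further development.
It helps researchers, engineers, and students quickly understand a paper and, when feasible, obtain a small implementation of the paper's core computational contribution.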

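## What It Does ✨

- 📄 Analyze academic paper PDFs.
- 🧠 Generate structured paper summaries with DeepSeek / Jiekou.
- 🧠 Generate structured paper summaries with DeepSeek / Jiekou / MiniMax / GLM.
- 🧮 Render Markdown, tables, and LaTeX formulas clearly.
- 💻 Decide whether the paper needs core code.
- 🧭 Plan a minimal core-code blueprint before writing files.
- 📦 Export generated core code as a local project folder.
- 🌐 Switch between Chinese and English UI/output.
- 🖥️ Run as a local Electron desktop app.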

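## Core Workflow 🚀

1. Select a model provider (DeepSeek / Jiekou) in the sidebar and configure your API key.
1. Select a model provider (DeepSeek / Jiekou / MiniMax / GLM) in the sidebar and configure your API key.
2. Choose a model.
3. Select a paper PDF.
4. Start analysis.
5. Read the paper summary as it streams in real time.
6. If core code exists, download the generated code project.
6. If code generation is applicable, the model first plans the smallest set of files needed to implement the paper's core contribution.
7. Once the blueprint and files pass local validation, download the generated core code.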

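## Core Code Generation

Paper2CoreCode is not a full experiment-reproduction generator. Its goal is to export only the smallest reusable code needed to express the paper's core computational contribution.

Before code is cached and made available for download, the model must first generate a core-code blueprint that states:

- The inferred paper domain.
- The core contribution to implement.
- The minimal implementation boundary.
- The exact list of files to generate.
- The purpose and main symbols of each file.
- Content intentionally omitted because it is not part of the core contribution.

Generated files must match the blueprint exactly. Extra files are rejected, missing blueprint files are rejected, and unsafe paths are rejected. This helps avoid over-generating training scripts, datasets, baselines, experiment runners, or full application pipelines when the paper proposes only a smaller method. For example, when the paper proposes only a loss, module, dispatch rule, signal-processing algorithm, controller, estimator, or objective function, only the corresponding core code is exported.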

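## Model Providers

@@ -42,13 +59,13 @@ API keys and model selections are saved per provider in the local app user data

Pull requests run `npm run build` on Windows, macOS, and Linux through GitHub Actions.

Pushing a version tag like `v0.1.2` triggers the release workflow, which builds platform packages for Windows, macOS, and Linux.
Pushing a version tag like `v0.1.3` triggers the release workflow, which builds platform packages for Windows, macOS, and Linux.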

## Tech Stack 🛠️

- Electron + TypeScript
- React + Vite
- DeepSeek / Jiekou API (OpenAI-compatible)
- DeepSeek / Jiekou / MiniMax / GLM APIs (OpenAI-compatible)
- `pdf-parse`
- `react-markdown` + KaTeX
- `electron-builder`
@@ -61,7 +78,7 @@ Pull requests use GitHub Actions on Windows, macOS, and Linux to run `n
- macOS: `.dmg` and `.zip`
- Linux: `.AppImage` and `.deb`

When a version tag like `v0.1.0` is pushed, release artifacts are built and uploaded automatically by GitHub Actions.
When a version tag like `v0.1.3` is pushed, release artifacts are built and uploaded automatically by GitHub Actions.

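## Development

@@ -90,7 +107,7 @@ npm run dist:linux

- API keys are stored in the local app user data directory.
- Scanned PDFs without extractable text are not supported yet.
- Generated code is cached locally first, then exported explicitly by the user.
- Generated code first passes blueprint validation and is cached locally, then is exported explicitly by the user.
- Current desktop builds are unsigned and use the default Electron icon.

## License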
29 changes: 29 additions & 0 deletions src/main/backend/codeCache.ts
@@ -3,9 +3,38 @@ export interface GeneratedFile {
  content: string
}

export interface CodeBlueprintFile {
  path: string
  purpose: string
  mainSymbols: string[]
  mustInclude: string[]
  mustNotInclude: string[]
  inputs?: string[]
  outputs?: string[]
  assumptions?: string[]
  evidence?: string
}

export interface CodeBlueprint {
  paperDomain?: string
  coreContribution: string
  minimalImplementationBoundary: string
  files: CodeBlueprintFile[]
  omitted?: Array<{
    item: string
    reason: string
  }>
  minimalityCheck?: {
    whyTheseFilesAreMinimal?: string
    couldAnyFileBeRemoved?: boolean
    overGenerationRisk?: string
  }
}

export interface CodeBundle {
  readme: string
  files: GeneratedFile[]
  blueprint?: CodeBlueprint
}

let cached: CodeBundle | null = null
65 changes: 63 additions & 2 deletions src/main/backend/exportCode.ts
@@ -1,6 +1,67 @@
import * as fs from 'fs'
import * as path from 'path'
import { getCachedCodeBundle } from './codeCache'
import { CodeBlueprint, getCachedCodeBundle } from './codeCache'

function escapeMarkdownTableCell(value: string): string {
  return value.replace(/\r?\n/g, ' ').replace(/\|/g, '\\|').trim()
}

function formatList(items: string[] | undefined): string {
  if (!items || items.length === 0) return 'Not specified.'
  return items.map((item) => `- ${item}`).join('\n')
}

function buildBlueprintReadmeSection(blueprint: CodeBlueprint): string {
  const generatedFiles = blueprint.files
    .map((file) => `| ${escapeMarkdownTableCell(file.path)} | ${escapeMarkdownTableCell(file.purpose)} | ${escapeMarkdownTableCell(file.mainSymbols.join(', '))} |`)
    .join('\n')

  const omitted = blueprint.omitted && blueprint.omitted.length > 0
    ? blueprint.omitted
      .map((item) => `| ${escapeMarkdownTableCell(item.item)} | ${escapeMarkdownTableCell(item.reason)} |`)
      .join('\n')
    : '| None specified. | Not applicable. |'

  const assumptions = blueprint.files.flatMap((file) => file.assumptions || [])
  const minimality = blueprint.minimalityCheck?.whyTheseFilesAreMinimal

  return `

## Core Code Scope

This export intentionally contains only the paper's minimal core computational contribution. It does not include experiment reproduction code, baselines, datasets, training scripts, simulators, or full application pipelines unless they are part of the proposed method itself.

### Implemented Core Contribution

${blueprint.coreContribution}

### Minimal Implementation Boundary

${blueprint.minimalImplementationBoundary}

${blueprint.paperDomain ? `### Inferred Paper Domain\n\n${blueprint.paperDomain}\n\n` : ''}### Generated Files

| File | Purpose | Main Symbols |
|---|---|---|
${generatedFiles}

### Intentionally Omitted

| Item | Reason |
|---|---|
${omitted}

### Assumptions

${formatList(assumptions)}

${minimality ? `### Minimality Check\n\n${minimality}\n` : ''}`
}

function buildExportReadme(readme: string, blueprint?: CodeBlueprint): string {
  if (!blueprint) return readme
  return `${readme.trimEnd()}${buildBlueprintReadmeSection(blueprint)}`
}

function resolveSafePath(rootDir: string, relativePath: string): string | null {
  const normalized = relativePath.replace(/\\/g, '/')
@@ -31,7 +92,7 @@ export function writeCodeFolder(outputDir: string): { ok: true; path: string } |

  try {
    fs.mkdirSync(outputDir, { recursive: true })
    fs.writeFileSync(path.join(outputDir, 'README.md'), bundle.readme, 'utf-8')
    fs.writeFileSync(path.join(outputDir, 'README.md'), buildExportReadme(bundle.readme, bundle.blueprint), 'utf-8')

    for (const file of bundle.files) {
      const target = resolveSafePath(outputDir, file.path)
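
The `escapeMarkdownTableCell` helper in `exportCode.ts` flattens line breaks and escapes pipes so that blueprint text cannot break the table in the exported README. The standalone check below copies the helper from this diff; the sample row values are hypothetical.

```typescript
// Copied from exportCode.ts in this diff: flattens newlines and escapes pipes
// so the value fits in a single Markdown table cell.
function escapeMarkdownTableCell(value: string): string {
  return value.replace(/\r?\n/g, ' ').replace(/\|/g, '\\|').trim()
}

// Hypothetical blueprint entry whose purpose contains a pipe and a line break.
const row = ['core/loss.py', 'Computes a | b\nper batch', 'reweighted_loss']
  .map((cell) => escapeMarkdownTableCell(cell))
  .join(' | ')

console.log(`| ${row} |`)
// | core/loss.py | Computes a \| b per batch | reweighted_loss |
```

Note that the helper does not escape existing backslashes, so a value that already ends in `\` immediately before a pipe may render differently than intended.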