31 commits
9ca145f
feat: concept dedup, compile pipeline refactor, and bidirectional bac…
KylinMountain Apr 10, 2026
901d8bc
feat: brief system, per-page JSON sources, and unified query agent
KylinMountain Apr 10, 2026
0bf7084
fix: default model, API key warning, config and CI improvements
KylinMountain Apr 10, 2026
5a75301
fix: improve init prompts, warning suppression, and CLI polish
rejojer Apr 10, 2026
8e9edeb
feat: improve query agent with multimodal get_image tool
rejojer Apr 10, 2026
44bf83e
refactor: unify image paths and add pymupdf per-page extraction
rejojer Apr 10, 2026
7ca95f9
fix: replace concept body on update instead of appending
rejojer Apr 10, 2026
d41588a
fix: use json_repair for robust LLM JSON parsing
rejojer Apr 10, 2026
7dd70c6
fix: use pdf_path.stem for full_text frontmatter path
rejojer Apr 10, 2026
b90f0b4
fix: sanitize concept names before links and index
rejojer Apr 10, 2026
3dd84f3
fix: pass doc_type and doc_brief in early return paths
rejojer Apr 10, 2026
ef60f7d
fix: sanitize concept name in _gen_update and correct _update_index d…
rejojer Apr 10, 2026
85eaebf
Merge pull request #10 from VectifyAI/bugfix/compile-clean
rejojer Apr 10, 2026
8818ada
fix: update existing concept briefs in index.md instead of skipping
rejojer Apr 10, 2026
aabcf5f
fix: preserve non-ASCII characters in concept name slugs
rejojer Apr 10, 2026
9df6e6c
fix: always replace concept body on update, not only when source is new
rejojer Apr 10, 2026
ef235d2
Fix concept index updates by section
rejojer Apr 10, 2026
3e3d56f
Fix exact concept index row matching
rejojer Apr 10, 2026
ed0d6ba
Fix exact concept index row matching
rejojer Apr 10, 2026
b6f6ba3
Revert "Fix exact concept index row matching"
rejojer Apr 10, 2026
2a15587
Merge pull request #11 from VectifyAI/bugfix/compiler-update-fixes
rejojer Apr 10, 2026
0291ec9
Merge pull request #12 from VectifyAI/dev
rejojer Apr 10, 2026
dc35625
Simplify init prompts and capture API key to .env
rejojer Apr 10, 2026
771452d
Merge pull request #13 from VectifyAI/feat/init-api-key-prompt
rejojer Apr 10, 2026
8c5bc2f
Use cloud OCR for per-page content in cloud mode
rejojer Apr 10, 2026
e0ab3f9
Bump pageindex to 0.3.0.dev1
rejojer Apr 10, 2026
b77e95d
Silence import-time warnings from pydub
rejojer Apr 10, 2026
2e1caf9
Merge pull request #14 from VectifyAI/dev
rejojer Apr 10, 2026
fde9b6d
feat: add SQLite-backed registry
kdush Apr 11, 2026
6dad765
feat: add SQLite backend and migration tests
kdush Apr 11, 2026
9436ad6
docs: document storage backend and migration
kdush Apr 11, 2026
7 changes: 4 additions & 3 deletions .github/workflows/publish.yml
@@ -8,12 +8,13 @@ on:
 jobs:
   publish:
     runs-on: ubuntu-latest
+    environment: pypi
     permissions:
       id-token: write
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.2.2
 
-      - uses: actions/setup-python@v5
+      - uses: actions/setup-python@f677139bbe7f9c59b41e40162b753c062f5d49a3 # v5.2.0
         with:
           python-version: "3.12"
 
@@ -24,4 +25,4 @@ jobs:
         run: python -m build
 
       - name: Publish to PyPI
-        uses: pypa/gh-action-pypi-publish@release/v1
+        uses: pypa/gh-action-pypi-publish@fb13cb306901256ace3dab689990e13a5550ffaa # release/v1.11.0
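
The workflow change above replaces moving tags (`v4`, `v5`, `release/v1`) with full commit SHAs. A minimal sketch of how such pinning could be checked automatically is shown below. The helper name `unpinned_actions` and the regular expressions are illustrative assumptions, not part of this repository; the sketch only inspects `uses:` lines that contain an `@ref` and ignores local (`./path`) actions.

```python
import re

# Hypothetical helper, not part of this PR: flag `uses:` references whose ref
# is not a full 40-character commit SHA.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*(?P<action>[^@\s]+)@(?P<ref>[^\s#]+)")
SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return `action@ref` strings whose ref is a tag or branch, not a SHA."""
    found = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if match and not SHA_RE.match(match.group("ref")):
            found.append(f"{match.group('action')}@{match.group('ref')}")
    return found


# The `uses:` lines before and after this change, copied from the diff.
BEFORE = """
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        uses: pypa/gh-action-pypi-publish@release/v1
"""
AFTER = """
      - uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.2.2
      - uses: actions/setup-python@f677139bbe7f9c59b41e40162b753c062f5d49a3 # v5.2.0
        uses: pypa/gh-action-pypi-publish@fb13cb306901256ace3dab689990e13a5550ffaa # release/v1.11.0
"""

print(unpinned_actions(BEFORE))
print(unpinned_actions(AFTER))
```

A check like this only confirms that each ref has the shape of a commit SHA. It does not verify that the SHA matches the version named in the trailing comment.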
12 changes: 12 additions & 0 deletions README.md
@@ -147,8 +147,20 @@ Settings are initialized by `openkb init`, and stored in `.openkb/config.yaml`:
 model: gpt-5.4 # LLM model (any LiteLLM-supported provider)
 language: en # Wiki output language
 pageindex_threshold: 20 # PDF pages threshold for PageIndex
+storage_backend: sqlite # Storage backend: sqlite (default) or json
 ```
 
+### Storage Backend
+
+OpenKB supports two storage backends for the file hash registry:
+
+| Backend | Description | Use Case |
+|---------|-------------|----------|
+| `sqlite` | SQLite database (default) | Better concurrency, scalability, recommended for production |
+| `json` | JSON file | Simple, human-readable, for small installations |
+
+Migration from JSON to SQLite happens automatically when you switch to `sqlite` backend and a `hashes.json` file exists. The JSON file is preserved but no longer used.
+
 Model names use `provider/model` LiteLLM [format](https://docs.litellm.ai/docs/providers) (OpenAI models can omit the prefix):
 
 | Provider | Model example |
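
The README addition above describes an automatic JSON-to-SQLite migration that leaves `hashes.json` in place. The sketch below illustrates one way such a migration could work; it is not the code from this PR. The function name `migrate_json_to_sqlite`, the `hashes` table and its columns, the `registry.db` filename, and the assumption that `hashes.json` maps each file hash to a metadata object are all hypothetical.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def migrate_json_to_sqlite(json_path: Path, db_path: Path) -> int:
    """Copy a JSON hash registry into SQLite and return the row count.

    Assumed (hypothetical) JSON layout: {"<file hash>": {...metadata...}}.
    The JSON file is read but never modified or deleted.
    """
    if not json_path.exists():
        return 0  # nothing to migrate

    entries = json.loads(json_path.read_text(encoding="utf-8"))
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS hashes ("
                "file_hash TEXT PRIMARY KEY, metadata TEXT NOT NULL)"
            )
            # INSERT OR IGNORE keeps the migration idempotent on re-runs.
            conn.executemany(
                "INSERT OR IGNORE INTO hashes (file_hash, metadata) VALUES (?, ?)",
                [(h, json.dumps(meta)) for h, meta in entries.items()],
            )
        (count,) = conn.execute("SELECT COUNT(*) FROM hashes").fetchone()
    finally:
        conn.close()
    return count


# Usage with made-up registry entries in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    legacy = Path(tmp) / "hashes.json"
    legacy.write_text(json.dumps({"abc123": {"name": "paper.pdf"}}), encoding="utf-8")
    print(migrate_json_to_sqlite(legacy, Path(tmp) / "registry.db"))  # prints 1
```

Using the file hash as the primary key means a second run after a partial or repeated migration does not create duplicate rows, which is one reason a sketch like this can safely run on every startup while the legacy file remains on disk.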