Bug: Documents stored with raw JSON instead of parsed fields #234

Closed
opened 2026-01-24 22:12:05 +00:00 by jack · 2 comments
Owner

Description

Documents are stored with the entire JSON response in the content field instead of extracting and storing individual fields properly.

Current Behavior

The content column contains the full JSON blob:

{
  "bytes": 46325,
  "code": 200,
  "codeText": "OK",
  "result": "> ## Documentation Index...",
  "durationMs": 448,
  "url": "https://code.claude.com/docs/en/hooks"
}

This causes the UI to display "{" or "[" as the document title (first character of the JSON string).

Expected Behavior

When saving documents, the storage logic should:

  1. Parse the JSON response
  2. Extract url → store in url or title column
  3. Extract result → store in content column (the actual documentation text)
  4. Store metadata (bytes, code, durationMs) separately if needed
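
The steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual storage code: the `DocumentRow` shape and the `parseWebFetchResponse` name are hypothetical, and the only field names taken from the issue are those in the sample response (`url`, `result`, `bytes`, `code`, `durationMs`).

```typescript
// Hypothetical row shape; the real `documents` table schema may differ.
interface DocumentRow {
  url: string;
  content: string;
  metadata: {
    bytes: number | null;
    code: number | null;
    durationMs: number | null;
  };
}

// Return the value only if it is a number, otherwise null.
function numberOrNull(value: unknown): number | null {
  return typeof value === "number" ? value : null;
}

// Parse a raw WebFetch response string into separate fields.
// Returns null when the input is not a JSON object with string `url` and `result`.
function parseWebFetchResponse(raw: string): DocumentRow | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return null;
  }
  const obj = parsed as Record<string, unknown>;
  const url = obj["url"];
  const result = obj["result"];
  if (typeof url !== "string" || typeof result !== "string") {
    return null;
  }
  return {
    url,
    content: result, // the documentation text, not the JSON wrapper
    metadata: {
      bytes: numberOrNull(obj["bytes"]),
      code: numberOrNull(obj["code"]),
      durationMs: numberOrNull(obj["durationMs"]),
    },
  };
}
```

Returning null for anything unexpected leaves the caller free to decide whether to store the raw string unchanged or skip the document.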

Location

The bug is in the document storage/caching logic, likely in:

  • MCP tool interceptor for Context7/WebFetch
  • Or wherever documents are saved to the documents table
Author
Owner

Additional Context

The document content field is a JSON string like:

{
  "bytes": 46325,
  "code": 200,
  "codeText": "OK",
  "result": "> ## Documentation Index...",
  "durationMs": 448,
  "url": "https://code.claude.com/docs/en/hooks"
}

The UI should parse this JSON and extract a meaningful title:

  • For WebFetch documents: Use url field (e.g., "code.claude.com/docs/en/hooks")
  • For Context7 (Library Docs): Parse the content and extract library name or topic

The UI currently appears to display content[0] (the first character of the JSON string) instead of parsing the JSON and extracting a title field.
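
As a rough sketch of the WebFetch case, a UI helper might parse the stored string and turn the `url` field into a display title. The function names here are hypothetical, and stripping the scheme with a regular expression is a simplifying assumption chosen to match the "code.claude.com/docs/en/hooks" example.

```typescript
// Strip the scheme and any trailing slashes so the title reads like
// "code.claude.com/docs/en/hooks".
function titleFromUrl(url: string): string {
  return url.replace(/^[a-z][a-z0-9+.-]*:\/\//i, "").replace(/\/+$/, "");
}

// Hypothetical UI helper: derive a title from a stored `content` string that
// still holds the raw WebFetch JSON. Returns null when no `url` can be found,
// so the caller can fall back to another strategy.
function titleFromWebFetchContent(content: string): string | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(content);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) {
    return null;
  }
  const url = (parsed as Record<string, unknown>)["url"];
  return typeof url === "string" && url.length > 0 ? titleFromUrl(url) : null;
}
```

A null result covers both non-JSON content and the Context7 array format, which has no top-level `url`.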

jack changed title from Bug: Documents view shows JSON characters instead of titles to Bug: Documents stored with raw JSON instead of parsed fields 2026-01-24 22:14:00 +00:00
Author
Owner

Context7 Format

Context7 returns an array of content blocks:

[
  {
    "type": "text",
    "text": "### SubagentStop Hook Event\n\nSource: https://github.com/anthropics/claude-code/blob/main/...\n\n..."
  }
]

Parsing needed:

  1. Extract text from each content block
  2. Parse the first header (e.g., ### SubagentStop Hook Event) as title
  3. Or extract the Source: URL as title
  4. Store the combined text content in the content column
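
One possible sketch of these parsing steps is shown below. The names `ParsedContext7` and `parseContext7Blocks` are hypothetical. Preferring the first header over the `Source:` URL, and joining blocks with a blank line, are assumptions, since the issue does not specify either.

```typescript
interface ParsedContext7 {
  title: string | null;
  content: string;
}

// Parse a raw Context7 response (a JSON array of content blocks) into a title
// and combined text. Returns null when the input is not a JSON array.
function parseContext7Blocks(raw: string): ParsedContext7 | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!Array.isArray(parsed)) {
    return null;
  }

  // Step 1: collect the `text` of every block that has one.
  const texts: string[] = [];
  for (const block of parsed as unknown[]) {
    if (typeof block === "object" && block !== null) {
      const text = (block as Record<string, unknown>)["text"];
      if (typeof text === "string") {
        texts.push(text);
      }
    }
  }
  const content = texts.join("\n\n");

  // Steps 2 and 3: first markdown header, else the first `Source:` URL.
  const header = /^#{1,6}[ \t]+(.+)$/m.exec(content);
  const source = /^Source:[ \t]*(\S+)/m.exec(content);
  const title = header?.[1]?.trim() ?? source?.[1] ?? null;

  // Step 4: the combined text is what would go in the `content` column.
  return { title, content };
}
```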

Summary of formats to handle:

| Source | Format | Title extraction |
|--------|--------|------------------|
| WebFetch | `{url, result, bytes, ...}` | Use `url` field |
| Context7 | `[{type: "text", text: "..."}]` | Parse first `###` header or `Source:` URL |
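
Since the two sources differ in their top-level JSON shape (object versus array), the storage logic could tell them apart before choosing a parser. The sketch below is an assumption about how that check might look; `DocumentSource` and `detectDocumentSource` are hypothetical names.

```typescript
type DocumentSource = "webfetch" | "context7" | "unknown";

// Guess the source of a raw response from its top-level JSON shape.
function detectDocumentSource(raw: string): DocumentSource {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return "unknown";
  }

  // Context7: a non-empty array whose items all carry a string `text`.
  if (Array.isArray(parsed)) {
    const allBlocks =
      parsed.length > 0 &&
      parsed.every(
        (b: unknown) =>
          typeof b === "object" &&
          b !== null &&
          typeof (b as Record<string, unknown>)["text"] === "string"
      );
    return allBlocks ? "context7" : "unknown";
  }

  // WebFetch: an object with string `url` and `result` fields.
  if (typeof parsed === "object" && parsed !== null) {
    const obj = parsed as Record<string, unknown>;
    if (typeof obj["url"] === "string" && typeof obj["result"] === "string") {
      return "webfetch";
    }
  }
  return "unknown";
}
```

Anything classified as "unknown" could be stored as plain text, so content that is not in either format is left untouched.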
Reference
customable/claude-mem#234