[Request] Feature requests for the future knowledge base #6054

BryceWG · 2025-02-12T10:54:37Z

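🥰 Requirement description

A knowledge base is generally understood to rely mainly on RAG, as opposed to passing a file directly as context.
My idea is to add a feature to the knowledge base: when attaching files in a conversation, you could choose to use a 'file' that is already in the knowledge base directly as context, while still keeping the option of using the 'knowledge base' itself as context. In effect this gives the knowledge base a cloud-drive role and gives the files in it a quick way to be attached.

🧐 Solution

When attaching files in a conversation, allow choosing a 'file' that is already in the knowledge base directly as context.

📝 Additional information

No response

@arvinxx: I'd like to use this issue to collect everyone's requests. If there is anything about the current knowledge base that you are unhappy with, please raise it here. Work on the Knowledge Base 2.0 overhaul starts in March, and I will read every request.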

lobehubbot · 2025-02-12T10:54:55Z

👀 @BryceWG

Thank you for raising an issue. We will look into the matter and get back to you as soon as possible.
Please make sure you have given us as much context as possible.

lobehubbot · 2025-02-12T10:54:58Z

🥰 Requirement description

A knowledge base is generally understood to rely mainly on RAG, as opposed to passing a file directly as context.
My idea is to add a feature to the knowledge base: when attaching files in a conversation, you could choose to use a 'file' that is already in the knowledge base directly as context, while still keeping the option of using the 'knowledge base' itself as context. In effect this gives the knowledge base a cloud-drive role and gives the files in it a quick way to be attached.

🧐 Solution

When selecting an attached file in a conversation, you can choose to directly use the 'file' already in the knowledge base as the context

📝 Supplementary information

No response

arvinxx · 2025-02-12T13:17:40Z

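The current interaction actually supports this already, but because it previously went through RAG, full-text injection was not possible and the results were not ideal. Version 2.0 will add full-text injection, which should greatly improve results in scenarios that need the full text to be referenced.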

lobehubbot · 2025-02-12T13:17:52Z

The current interaction actually supports this already, but because it previously went through RAG, full-text injection was not possible and the results were not ideal. Version 2.0 will add full-text injection, which should greatly improve results in scenarios that need the full text to be referenced.
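
The difference between full-text injection and RAG can be sketched roughly as follows. This is only an illustration under stated assumptions: `KnowledgeFile`, `ContextPlan`, `estimateTokens`, and `planContext` are hypothetical names, not LobeChat APIs, and the 4-characters-per-token estimate is a crude stand-in for a real tokenizer.

```typescript
// Hypothetical types for illustration; none of these names come from LobeChat.
interface KnowledgeFile {
  name: string;
  text: string; // parsed plain text of the file
}

type ContextPlan =
  | { mode: 'full-text'; content: string }
  | { mode: 'rag'; reason: string };

// Crude token estimate (about 4 characters per token). This is an assumption,
// not a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Inject the whole file when it fits the budget; otherwise fall back to RAG.
function planContext(file: KnowledgeFile, tokenBudget: number): ContextPlan {
  const tokens = estimateTokens(file.text);
  if (tokens <= tokenBudget) {
    return { mode: 'full-text', content: file.text };
  }
  return {
    mode: 'rag',
    reason: `${file.name} needs about ${tokens} tokens, budget is ${tokenBudget}`,
  };
}
```

A caller would pass the model's remaining context budget; files that fit are sent whole, and larger ones keep using chunk retrieval.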

BryceWG · 2025-02-12T13:37:34Z

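Speaking of references, could quoting conversation content be supported? Similar to this feature in Doubao: select a passage of text in the conversation and use it as context: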

lobehubbot · 2025-02-12T13:37:47Z

Speaking of references, could quoting conversation content be supported? Similar to this feature in Doubao: select a passage of text in the conversation and use it as context:

Alencryenfo · 2025-02-13T01:38:15Z

same request
This feature seems very useful

lobehubbot · 2025-02-13T01:38:28Z

same request
This function feels very useful

git268 · 2025-02-13T02:57:45Z

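One addition: currently, after a file is uploaded to lobechat, you still have to wait for vectorization before it is sent to the model. Some PPT or PDF files that contain fairly complex charts fail with a vectorization error. Other similar platforms such as cherry studio seem to send the whole file to the model, which is faster and reads the content more accurately.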

lobehubbot · 2025-02-13T02:58:03Z

One addition: currently, after a file is uploaded to lobechat, you still have to wait for vectorization before it is sent to the model. Some PPT or PDF files that contain fairly complex charts fail with a vectorization error. Other similar platforms such as cherry studio seem to send the whole file to the model, which is faster and reads the content more accurately.

BryceWG · 2025-02-13T03:00:06Z

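I have used cherry too. Only a few providers' APIs accept files directly. According to its author, for APIs without dedicated support the file content is actually parsed locally and then sent.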

lobehubbot · 2025-02-13T03:00:18Z

I have used cherry too. Only a few providers' APIs accept files directly. According to its author, for APIs without dedicated support the file content is actually parsed locally and then sent.

Sun-drenched · 2025-02-14T05:26:48Z

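1. Suggest integrating a top-tier document parsing API such as Doc2X to improve the parsing accuracy of knowledge base documents.
2. Because passing documents directly to the API is not universally supported, suggest providing access to the full (parsed) original text. It could be shown in its own section, like the current deep-thinking section, collapsed by default.
3. Allow creating and conveniently sharing assistants that come with their vectorized data and original-text knowledge base (for example through a team space, a public or private assistant market, or by exporting an archive or similar file that can be imported directly).
4. Give the mobile client the same knowledge base management features as the desktop client.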

lobehubbot · 2025-02-14T05:27:01Z

1. Suggest integrating a top-tier document parsing API such as Doc2X to improve the parsing accuracy of knowledge base documents.
2. Because passing documents directly to the API is not universally supported, suggest providing access to the full (parsed) original text. It could be shown in its own section, like the current deep-thinking section, collapsed by default.
3. Allow creating and conveniently sharing assistants that come with their vectorized data and original-text knowledge base (for example through a team space, a public or private assistant market, or by exporting an archive or similar file that can be imported directly).
4. Give the mobile client the same knowledge base management features as the desktop client.

yagev5 · 2025-02-15T11:32:40Z

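I hope a later version adds an entry in the knowledge base settings for choosing the model used for vectorization, so that users can build the knowledge base with the online or local model they prefer. I tried switching models by changing environment variables as described in the help docs, but that is too much trouble, and some models then report errors. I also hope the knowledge base can later accept web page URLs directly and automatically vectorize the page content into the knowledge base.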

lobehubbot · 2025-02-15T11:32:52Z

I hope a later version adds an entry in the knowledge base settings for choosing the model used for vectorization, so that users can build the knowledge base with the online or local model they prefer. I tried switching models by changing environment variables as described in the help docs, but that is too much trouble, and some models then report errors. I also hope the knowledge base can later accept web page URLs directly and automatically vectorize the page content into the knowledge base.

SAnBlog · 2025-02-17T07:58:32Z

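I'd like to be able to create documents online, edit their content online in Markdown, and manually vectorize the saved data. In a conversation you could then tick a single document or a folder of documents. The result would be AI notes plus chat.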

lobehubbot · 2025-02-17T07:58:46Z

I'd like to be able to create documents online, edit their content online in Markdown, and manually vectorize the saved data. In a conversation you could then tick a single document or a folder of documents. The result would be AI notes plus chat.

memset0 · 2025-02-18T14:31:40Z

Providers such as Gemini offer a file upload API. I hope full-text injection can optionally call such an API directly for better performance.

lobehubbot · 2025-02-18T14:31:51Z

Providers such as Gemini offer a file upload API. I hope full-text injection can optionally call such an API directly for better performance.

memset0 · 2025-02-18T14:34:01Z

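My suggestion is to add a panel to the sidebar of the chat interface that lists the files and knowledge bases of the current conversation with checkboxes, so that for each message you can choose to send only some files or run RAG retrieval over only some files (perhaps an interaction similar to NotebookLM).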

lobehubbot · 2025-02-18T14:34:14Z

My suggestion is to add a panel to the sidebar of the chat interface that lists the files and knowledge bases of the current conversation with checkboxes, so that for each message you can choose to send only some files or run RAG retrieval over only some files (perhaps an interaction similar to NotebookLM).

rollby · 2025-02-19T03:11:20Z

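Suggest adding these features to the knowledge base:
1. Support creating folders, to organize content within a knowledge base.
2. Support creating files or segments online, to make it easy to add content.
3. Support file archiving or version updates. For example, I have a contact list that keeps changing; I should be able to overwrite the old version of the file directly, or archive the old version.
4. Support team management and team-shared knowledge bases.
5. Support tagging knowledge base files, to make files quick to find and easier to retrieve.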

lobehubbot · 2025-02-19T03:11:32Z

Suggest adding these features to the knowledge base:

1. Support creating folders, to organize content within a knowledge base.
2. Support creating files or segments online, to make it easy to add content.
3. Support file archiving or version updates. For example, I have a contact list that keeps changing; I should be able to overwrite the old version of the file directly, or archive the old version.
4. Support team management and team-shared knowledge bases.
5. Support tagging knowledge base files, to make files quick to find and easier to retrieve.

Steve235lab · 2025-02-19T22:37:50Z

Chunking currently fails for PDFs that have not been OCR-processed. I hope ocrmypdf can be built in to preprocess such documents.

lobehubbot · 2025-02-19T22:38:04Z

Chunking currently fails for PDFs that have not been OCR-processed. I hope ocrmypdf can be built in to preprocess such documents.

lobehubbot · 2025-02-26T06:59:38Z

If the knowledge base contains pictures, can the pictures be output when the answer is output?

xuzhouqqq · 2025-02-26T09:53:08Z

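In the knowledge base, when a few hundred files are added in one batch and then vectorized in bulk, some of them fail to vectorize.

Could the file list have a filter that separates vectorized from unvectorized files, so that the failed ones can be found quickly and vectorized again in bulk?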

lobehubbot · 2025-02-26T09:53:21Z

In the knowledge base, when a few hundred files are added in one batch and then vectorized in bulk, some of them fail to vectorize.

Could the file list have a filter that separates vectorized from unvectorized files, so that the failed ones can be found quickly and vectorized again in bulk?
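
The filter requested above could look something like this minimal sketch. The `FileItem` shape, the `EmbeddingStatus` values, and both function names are assumptions made for illustration; they are not taken from LobeChat's actual data model.

```typescript
// Hypothetical status values and file shape, for illustration only.
type EmbeddingStatus = 'pending' | 'processing' | 'success' | 'error';

interface FileItem {
  id: string;
  name: string;
  embeddingStatus: EmbeddingStatus;
}

// Keep only the files whose status is one of the requested values.
function filterByStatus(files: FileItem[], statuses: EmbeddingStatus[]): FileItem[] {
  const wanted = new Set<EmbeddingStatus>(statuses);
  return files.filter((file) => wanted.has(file.embeddingStatus));
}

// IDs of files that were never vectorized or whose vectorization failed,
// ready to be passed to a bulk "vectorize again" action.
function idsNeedingVectorization(files: FileItem[]): string[] {
  return filterByStatus(files, ['pending', 'error']).map((file) => file.id);
}
```

With a list like this, the UI only needs a status dropdown plus a "retry selected" button on the filtered result.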

51762344 · 2025-02-27T07:24:22Z

Thanks, it was only after seeing your comment that I understood why chunking kept failing

lobehubbot · 2025-02-27T07:24:36Z

Thanks, it was only after seeing your comment that I understood why chunking kept failing

lobehubbot · 2025-03-05T03:34:25Z

At present, our knowledge base still relies on traditional similarity search methods, which have certain limitations in handling complex queries and semantic understanding. In order to further improve the performance of the knowledge base and the accuracy of content replies, I suggest considering the introduction of knowledge graph technology.

Specifically, consider integrating the following projects into our knowledge base features:

LightRAG:
Project Introduction: LightRAG is an efficient knowledge graph embedding framework that can improve the accuracy and efficiency of knowledge retrieval through graph neural networks.
GitHub repository address: HKUDS/LightRAG
KAG (Knowledge Augmented Generation):
Project Introduction: KAG is an enhanced framework based on knowledge graphs, aiming to optimize information retrieval and question-and-answer systems through graph structure.
GitHub repository address: OpenSPG/KAG

Expected results:

Improve accuracy: Through the introduction of knowledge graphs, the system can more accurately understand the semantics of user queries and provide more accurate responses.
Enhanced semantic understanding: Knowledge graphs can capture relationships between entities and improve the system's ability to process complex queries.
Extended knowledge coverage: Using the relevance of the graph, you can expand the coverage of the knowledge base and provide more comprehensive information.

Implementation Suggestions:

Technical evaluation: A technical evaluation of LightRAG and KAG is carried out to determine their compatibility with existing systems.
Prototyping: Build a prototype system to verify the effect of knowledge graph embedding.
Step by step integration: After verifying the effect, gradually integrate the knowledge graph function into the main system.

I hope the team can consider this suggestion and jointly promote the optimization and upgrading of knowledge base functions.

ruanjunmin · 2025-03-05T03:34:34Z

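At present, our knowledge base still relies on traditional similarity search, which has limitations when handling complex queries and semantic understanding. To further improve the performance of the knowledge base and the accuracy of replies, I suggest considering knowledge graph technology.

Specifically, the following projects could be considered for integration into the knowledge base feature:

LightRAG:
Project introduction: LightRAG is an efficient knowledge graph embedding framework that can improve the accuracy and efficiency of knowledge retrieval through graph neural networks.
GitHub repository: HKUDS/LightRAG
KAG (Knowledge Augmented Generation):
Project introduction: KAG is an enhanced framework based on knowledge graphs that aims to optimize information retrieval and question-answering systems through graph structure.
GitHub repository: OpenSPG/KAG

Expected results:

Improved accuracy: with knowledge graphs, the system can understand the semantics of user queries more accurately and give more accurate replies.
Stronger semantic understanding: knowledge graphs capture relationships between entities and improve the system's ability to handle complex queries.
Wider knowledge coverage: the connections in the graph can extend the coverage of the knowledge base and provide more complete information.

Implementation suggestions:

Technical evaluation: evaluate LightRAG and KAG to determine their compatibility with the existing system.
Prototyping: build a prototype to verify the effect of knowledge graph embedding.
Gradual integration: once the effect is verified, integrate the knowledge graph feature into the main system step by step.

I hope the team can consider this suggestion so that we can improve and upgrade the knowledge base together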

k10876 · 2025-03-05T12:28:18Z

I hope that we can support direct file submission for capable multimodal models. The files contain PDFs or videos and other stuff.

Better if there's a tickbox "Send files without chunking" in the settings for respective models.

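If possible, I hope uploading the relevant files directly can be supported, especially PDF files. It would be even better if this could be configured separately for each model in the model settings.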

Lockeysama · 2025-03-06T03:35:14Z

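I hope there can be a shortcut, similar to Tencent IMA, for quickly saving a conversation as a note file, which can then be converted into a knowledge base manually or automatically. It would be even better if the notes could be edited.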

lobehubbot · 2025-03-06T03:35:27Z

I hope there can be a shortcut, similar to Tencent IMA, for quickly saving a conversation as a note file, which can then be converted into a knowledge base manually or automatically. It would be even better if the notes could be edited.

bbbugg · 2025-03-09T10:47:36Z

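When I ask about several files together, not all of the files are recognized; the answer is based on chunks from only some of them. Asking about each file separately works fine.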

lobehubbot · 2025-03-09T10:47:50Z

When I ask about several files together, not all of the files are recognized; the answer is based on chunks from only some of them. Asking about each file separately works fine.

screw-hand · 2025-03-10T12:34:56Z

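Scenario description

Environment

Deployment: self-hosted (database version)
Usage: frequent use of image upload in the "casual chat" conversation

Current pain points

File preview limitations
- Only list view (Grid mode) is supported, so image content cannot be browsed quickly
- Batch management requires clicking into each preview to confirm the content (file names are highly repetitive)
Storage mechanism issues
- Images uploaded in any conversation are persisted to the knowledge base by default
- After a conversation is deleted, its files remain in the knowledge base
- Long-term use produces a large number of redundant files that must be cleaned up manually

Feature suggestions

View mode improvements
- Add a waterfall (Masonry) image preview mode
- Support adjusting thumbnail size
Storage mechanism improvements
- Add a setting: automatically save files uploaded in conversations to the knowledge base (on by default)
- Add batch management:
  - Filter by time range
  - Filter by source conversation
  - Multi-select delete

Additional information

Example of the current file management interface:

Priority

If this request is currently low priority, I will try to implement it myself after the knowledge base refactor.

Finally: thanks to the owner for developing lobe. It is a great help to us heavy AI users!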

lobehubbot · 2025-03-10T12:35:13Z

Scene Description

Environment Configuration

Deployment method: self-hosted (database version)
Usage scenario: Use the "Random Dialogue" image upload function at high frequency

Current pain points

File preview limitations
- Only list view (Grid mode) is supported, and the image content cannot be viewed quickly
- During batch management, you need to click preview to confirm the content one by one (the file name duplication rate is high)
Storage mechanism issues
- All images uploaded by dialogue scenes are persisted to the knowledge base by default
- The relevant files remain in the knowledge base after the dialogue is deleted
- After long-term use, a large number of redundant files needing manual cleaning

Functional Suggestions

View mode enhancement
- Added the waterfall (Masonry) image preview mode
- Support thumbnail size adjustment
Storage mechanism optimization
- Add a setting: automatically save files uploaded in conversations to the knowledge base (on by default)
- Add batch management function:
  - Filter by time range
  - Filter by session source
  - Multiple choice deletion operation

Additional Information

Example of the current file management interface:

Priority Description

If the current priority of this requirement is low, I will try to implement it myself after refactoring the knowledge base.

Finally: thanks to the owner for developing lobe. It is a great help to us heavy AI users!

m0eak · 2025-03-10T13:17:44Z

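Agreed. It would be even better if uploaded images were automatically named after the conversation's assistant name plus the date. Otherwise they are all called image, which is a headache, and there is no thumbnail view either.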

lobehubbot · 2025-03-10T13:17:59Z

Agreed. It would be even better if uploaded images were automatically named after the conversation's assistant name plus the date. Otherwise they are all called image, which is a headache, and there is no thumbnail view either.
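
One possible shape for such automatic naming is sketched below. `buildUploadName` is a hypothetical helper, and the `assistant_date.ext` pattern is just one assumed convention (using the UTC upload time), not something LobeChat currently does.

```typescript
// Hypothetical helper: builds "<assistant>_<YYYY-MM-DD-HH-mm-ss><ext>" from
// the assistant name, the upload time (in UTC), and the original file name.
function buildUploadName(assistantName: string, uploadedAt: Date, originalName: string): string {
  // Keep the original extension, if there is one.
  const dot = originalName.lastIndexOf('.');
  const ext = dot > 0 ? originalName.slice(dot) : '';

  // Replace whitespace and characters that are unsafe in file names.
  const safeAssistant =
    assistantName.trim().replace(/[\\\/:*?"<>|\s]+/g, '-') || 'assistant';

  // "2025-03-10T13:17:44.000Z" -> "2025-03-10-13-17-44"
  const stamp = uploadedAt.toISOString().slice(0, 19).replace(/[:T]/g, '-');

  return `${safeAssistant}_${stamp}${ext}`;
}
```

Because the timestamp goes down to the second, two pasted images in the same conversation would normally still get distinct names.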

Jean201093266 · 2025-03-11T03:58:28Z

Suggest supporting custom vectorization tools

J4gQBqqR · 2025-03-14T10:59:33Z

Agreed. There are a bunch of RAG-related bugs right now; I hope they can be fixed in bulk in 2.

lobehubbot · 2025-03-14T10:59:44Z

Agree, there are now a bunch of RAG-related bugs, I hope it will be solved in batches in 2.

FFatTiger · 2025-03-15T16:46:29Z

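🥰 Requirement description
Could a button be added to vectorize knowledge base files serially? When there are many files, almost every API platform rate-limits that many vectorization requests launched in parallel, and selecting files one at a time and waiting for each to finish before continuing is tedious.

🧐 Solution
Add a button that vectorizes knowledge base files serially

📝 Additional information
No response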

lobehubbot · 2025-03-15T16:46:38Z

🥰 Requirement description
Could a button be added to vectorize knowledge base files serially? When there are many files, almost every API platform rate-limits that many vectorization requests launched in parallel, and selecting files one at a time and waiting for each to finish before continuing is tedious.

🧐 Solution
Add a button that vectorizes knowledge base files serially

📝 Supplementary information
No response
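
The serial behaviour requested here might be sketched as below. `vectorizeSerially` and the `vectorize` callback are hypothetical; the callback stands in for whatever request the client actually sends for one file.

```typescript
interface SerialResult {
  succeeded: string[];
  failed: string[];
}

// Vectorize files one at a time. The `vectorize` callback is an assumed
// stand-in for the real per-file request. A failure is recorded and the loop
// moves on, so one bad file does not block the rest of the queue.
async function vectorizeSerially(
  fileIds: string[],
  vectorize: (fileId: string) => Promise<void>,
): Promise<SerialResult> {
  const result: SerialResult = { succeeded: [], failed: [] };
  for (const fileId of fileIds) {
    try {
      // Awaiting inside the loop guarantees at most one request in flight.
      await vectorize(fileId);
      result.succeeded.push(fileId);
    } catch {
      result.failed.push(fileId);
    }
  }
  return result;
}
```

The `failed` list can be fed straight back into the same function for a retry pass, which avoids the rate limits that a parallel burst tends to hit.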

ImSingee · 2025-03-17T04:31:02Z

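I hope indexing web data into the knowledge base can be supported

Reason: Cursor has a nice feature, @docs, for searching within documentation, and docs can be indexed automatically by entering the entry page of a documentation site. I hope lobechat can support this too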

lobehubbot · 2025-03-17T04:31:14Z

Hopefully, it can support indexing web data into the knowledge base

Reason: A good function of Cursor is that @docs is used to search in documents, and docs can be automatically indexed by entering the entry page of a document. Hopefully lobechat can also support this function

chiheye · 2025-03-21T12:16:13Z

I hope the author will support categorizing the database and support second-stage reranking.

18651619390 · 2025-03-24T06:43:12Z

What is the current progress on Knowledge Base 2.0 development? Is there a timeline?

lobehubbot · 2025-03-24T06:43:22Z

What is the current progress on Knowledge Base 2.0 development? Is there a timeline?

ahuai · 2025-03-27T03:07:49Z

I hope the gpt-4o "Visualize ideas and concepts" feature, which has been very popular over the last few days, can be integrated
https://weibo.com/1727858283/PkkfKyNwy

smcdl · 2025-04-05T06:52:29Z

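Suggest allowing the embedding model and the rerank model to be specified manually. The config-file approach is inconvenient, and the range of supported model providers is limited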

lobehubbot · 2025-04-05T06:52:39Z

Suggest allowing the embedding model and the rerank model to be specified manually. The config-file approach is inconvenient, and the range of supported model providers is limited.

BryceWG added the 🌠 Feature Request New feature or request | 特性与建议 label Feb 12, 2025

dosubot bot added the files 上传文件/知识库 label Feb 12, 2025

arvinxx added this to the Knowledgebase 2.0 milestone Feb 12, 2025

arvinxx pinned this issue Feb 13, 2025
