marker-pdf 打开--use_llm选型 #3617

warlockedward · 2025-01-17T03:47:50Z

例行检查

我已确认目前没有类似 features
我已确认我已升级到最新版本
我已完整查看过项目 README，已确定现有版本无法满足需求
我理解并愿意跟进此 features，协助测试和提供反馈
我理解并认可上述内容，并理解项目维护者精力有限，不遵循规则的 features 可能会被无视或直接关闭

功能描述
在marker开源社区中发现有一个功能为--use_llm的设置
PDF 是一种棘手的格式，因此标记并不总是能完美地工作。以下是一些已知的限制，这些限制正在规划中：

Marker 只会转换块方程
表格格式并不总是 100% 正确 - 多行单元格有时会被分成多行。
表格转换效果不佳
布局非常复杂，带有嵌套表格和表单，可能无法正常工作
注意：传递--use_llm标志将基本解决所有这些问题。
应用场景

相关示例

class BaseLLMProcessor(BaseProcessor):
"""
A processor for using LLMs to convert blocks.
Attributes:
google_api_key (str):
The Google API key to use for the Gemini model.
Default is None.
model_name (str):
The name of the Gemini model to use.
Default is "gemini-1.5-flash".
max_retries (int):
The maximum number of retries to use for the Gemini model.
Default is 3.
max_concurrency (int):
The maximum number of concurrent requests to make to the Gemini model.
Default is 3.
timeout (int):
The timeout for requests to the Gemini model.
gemini_rewriting_prompt (str):
The prompt to use for rewriting text.
Default is a string containing the Gemini rewriting prompt.
use_llm (bool):
Whether to use the LLM model.
Default is False.
"""

google_api_key: Optional[str] = settings.GOOGLE_API_KEY
model_name: str = "gemini-1.5-flash"
use_llm: bool = False
max_retries: int = 3
max_concurrency: int = 3
timeout: int = 60
image_expansion_ratio: float = 0.01
gemini_rewriting_prompt = None
block_types = None

如何在fastgpt的docker环境中实现这样的功能，或者可以进行修改代码的文件，来实现基于openai api的多模态调用，非常感谢

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

marker-pdf 打开--use_llm选型 #3617

marker-pdf 打开--use_llm选型 #3617

warlockedward commented Jan 17, 2025

marker-pdf 打开--use_llm选型 #3617

marker-pdf 打开--use_llm选型 #3617

Comments

warlockedward commented Jan 17, 2025