5 Commits

Author SHA1 Message Date
6807371c5e feat: add content filter and code processing module (v0.3.0)
- Add content filter module (internal/content/)
- Implement basic character filtering (control chars, line breaks, symbols)
- Implement code block and inline code detection
- Implement comment detection for 30+ languages (JS/Python/Go/HTML/etc)
- Add go-enry dependency for intelligent language detection
- Add SkipKeywords config option (default: TODO/FIXME/HACK/XXX/etc)
- Integrate content processing into Translator
- Update config.yaml with skip_keywords
2026-03-29 18:41:25 +08:00
1bce2d9c7a merge: v0.2.0 language support and onboard wizard 2026-03-29 01:30:59 +08:00
24ba405d55 feat: add language support and onboard configuration wizard (v0.2.0)
- Add language code intelligent parsing module (internal/lang)
- Support --lang parameter for target language specification
- Support multiple language code formats (BCP47, aliases, Chinese names)
- Implement interactive onboard configuration wizard
- Update Config struct with language fields
- Add survey library dependency for interactive UI
- Improve CLI command interface
- Add comprehensive unit tests for language module
- Update documentation (AGENTS.md, changelog.md, taolun.md, memory.md)

Supported language codes:
- Standard: zh-CN, zh-TW, en-US, en-GB, ja, ko, es, fr, de
- Aliases: cn, en, jp, kr, es, fr, de
- Chinese names: chinese, english, japanese

Commands:
- yoyo "Hello world" - basic translation
- yoyo --lang=cn "Hello world" - specify target language
- yoyo onboard - start configuration wizard
- yoyo onboard --force - force reconfiguration

Version: 0.2.0
2026-03-29 01:30:42 +08:00
e18fbad839 merge: v0.0.3 environment variable loading fix 2026-03-28 23:31:33 +08:00
7c8a924984 merge: v0.0.2 core architecture implementation 2026-03-28 23:27:19 +08:00
15 changed files with 1789 additions and 14 deletions


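@@ -761,7 +761,64 @@ A: Use exponential backoff for retries, and implement a rate limiter in `internal/api/`.
### Q: How do I add support for more languages?
A: Add the language mapping to the configuration file and update the translation logic.
## Language Code Handling
### Supported Language Code Formats
The project accepts several language code formats, all handled by the `internal/lang` module:
1. **Standard BCP 47 format**: `zh-CN`, `zh-TW`, `en-US`, `en-GB`, `ja`, `ko`
2. **Short aliases**: `cn` (Chinese), `en` (English), `jp` (Japanese), `kr` (Korean), etc.
3. **Language names (English or Chinese)**: `chinese` (Chinese), `english` (English), `japanese` (Japanese), etc.
### Language Parsing Functions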
```go
// Parse a language code
lang.ParseLanguageCode("cn") // returns "zh-CN"
lang.ParseLanguageCode("en") // returns "en-US"
lang.ParseLanguageCode("zh-TW") // returns "zh-TW"
// Get the language name (for display)
lang.GetLanguageName("zh-CN") // returns "中文(简体)"
lang.GetLanguageName("en-US") // returns "English (US)"
```
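The lookup order described above (alias first, then tag normalization, otherwise the input unchanged) can be sketched as follows. This is a minimal, self-contained illustration and not the module's actual code: `resolveLang` and the reduced `aliases` table are hypothetical names used only here.

```go
package main

import (
	"fmt"
	"strings"
)

// aliases is a small illustrative subset of alias -> canonical tag mappings.
var aliases = map[string]string{
	"cn":      "zh-CN",
	"en":      "en-US",
	"jp":      "ja",
	"chinese": "zh-CN",
}

// resolveLang (hypothetical helper) checks the alias table first, then
// normalizes a "language-region" tag; anything else is returned trimmed.
func resolveLang(input string) string {
	s := strings.TrimSpace(input)
	if code, ok := aliases[strings.ToLower(s)]; ok {
		return code
	}
	parts := strings.Split(s, "-")
	if len(parts) == 2 && len(parts[0]) >= 2 && len(parts[0]) <= 3 && len(parts[1]) == 2 {
		// Lowercase language, uppercase region.
		return strings.ToLower(parts[0]) + "-" + strings.ToUpper(parts[1])
	}
	return s
}

func main() {
	for _, in := range []string{"cn", "EN", "zh-tw", "unknown"} {
		fmt.Printf("%q -> %q\n", in, resolveLang(in))
	}
}
```

Checking aliases before validating the tag shape keeps short inputs such as `en` mapped to a full tag instead of being passed through as-is.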
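## Onboard Configuration Wizard
### Configuration Flow
1. Select the primary translation provider
2. Configure the provider's API key, host, and model
3. Set global options (default language, timeout)
4. Save the configuration to `configs/config.yaml`
### Usage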
```bash
yoyo onboard # Start the configuration wizard
yoyo onboard --force # Force reconfiguration
```
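### Wizard Implementation
- Uses `github.com/AlecAivazis/survey/v2` for the interactive interface
- Supports provider selection, API configuration, and language settings
- Generates a standard YAML configuration file
## Phased Migration Strategy
### Phase 1: Development (current)
- API keys are stored in the `.env` file
- Complex configuration is stored in `configs/config.yaml`
- Environment variable substitution is supported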
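As a rough illustration of environment variable substitution, the sketch below expands `${VAR}` references in raw config text with the standard library's `os.Expand`. The placeholder syntax the project actually uses is not shown in this document, so the `${...}` form, the `expandConfig` helper, and the `EXAMPLE_API_KEY` variable are all assumptions.

```go
package main

import (
	"fmt"
	"os"
)

// expandConfig (hypothetical helper) replaces ${VAR} and $VAR references in
// raw config text, resolving each name through lookup.
func expandConfig(raw string, lookup func(string) string) string {
	return os.Expand(raw, lookup)
}

func main() {
	// EXAMPLE_API_KEY is a made-up variable name for this demonstration.
	os.Setenv("EXAMPLE_API_KEY", "sk-demo")
	raw := `api_key: "${EXAMPLE_API_KEY}"`
	fmt.Println(expandConfig(raw, os.Getenv))
}
```

Passing the lookup function as a parameter keeps the helper easy to test without touching the real process environment.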
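### Phase 2: Before Release
- Implement a configuration file path lookup mechanism
- Support the user configuration directory `~/.config/yoo/yoo.yml`
- Provide a configuration migration tool
### Phase 3: Final Cleanup
- Remove the dependency on the `.env` file
- Rely entirely on configuration files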
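One possible shape for the path lookup planned in phase 2 is sketched below. It is only an assumption about how such a mechanism might work: the precedence (user directory before the project-local file) and the helper names `firstExisting` and `fileExists` are illustrative, while the two paths are the ones named in this document.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// firstExisting returns the first candidate for which exists reports true,
// or "" when none of them exist.
func firstExisting(candidates []string, exists func(string) bool) string {
	for _, p := range candidates {
		if exists(p) {
			return p
		}
	}
	return ""
}

// fileExists reports whether path names an existing non-directory entry.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && !info.IsDir()
}

func main() {
	candidates := []string{filepath.Join("configs", "config.yaml")}
	if home, err := os.UserHomeDir(); err == nil {
		// Assumed precedence: the user-level file wins over the project-local one.
		userPath := filepath.Join(home, ".config", "yoo", "yoo.yml")
		candidates = append([]string{userPath}, candidates...)
	}
	fmt.Printf("using config: %q\n", firstExisting(candidates, fileExists))
}
```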
## References
- [Effective Go](https://go.dev/doc/effective_go)
- [Go Code Review Comments](https://github.com/golang/go/wiki/CodeReviewComments)
- [Go Style Guide](https://google.github.io/styleguide/go/)
- [Survey library documentation](https://github.com/AlecAivazis/survey)


@@ -32,6 +32,92 @@
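## Version History
### 0.3.0 (2026-03-29) - Content Filtering and Code Handling
**Type**: Feature release
**Status**: In development
**Changes**:
- ✅ Add content filter module (internal/content/)
- ✅ Implement basic character filtering (remove control characters, normalize line breaks, truncate overlong symbol runs)
- ✅ Implement detection of code blocks and inline code
- ✅ Implement smart detection of code comments (30+ languages, including JS/TS/Java/Python/Go/HTML)
- ✅ Add the go-enry dependency for programming language detection
- ✅ Add the SkipKeywords config option; by default TODO/FIXME/HACK and similar keywords are left untranslated
- ✅ Integrate content processing into the Translator module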
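The three basic filtering steps listed above can be sketched without regular expressions. This is an illustrative approximation under stated assumptions, not the code in `internal/content/filter.go`: the names `cleanText` and `capRuns` are hypothetical, and the symbol set is copied from the filter shown later in this commit.

```go
package main

import (
	"fmt"
	"strings"
)

// capRuns limits every run of the same symbol character to at most limit repeats.
func capRuns(text, symbols string, limit int) string {
	var b strings.Builder
	var prev rune
	run := 0
	for i, r := range text {
		if i > 0 && r == prev {
			run++
		} else {
			run = 1
		}
		prev = r
		if strings.ContainsRune(symbols, r) && run > limit {
			continue // drop the excess part of the run
		}
		b.WriteRune(r)
	}
	return b.String()
}

// cleanText normalizes line breaks, drops control characters other than
// tab and newline, and caps runs of decorative symbols.
func cleanText(text string, limit int) string {
	text = strings.ReplaceAll(text, "\r\n", "\n")
	text = strings.ReplaceAll(text, "\r", "\n")
	text = strings.Map(func(r rune) rune {
		if (r < 0x20 && r != '\n' && r != '\t') || r == 0x7f {
			return -1 // a negative value removes the rune
		}
		return r
	}, text)
	return capRuns(text, "=-_*#~`.", limit)
}

func main() {
	fmt.Printf("%q\n", cleanText("Title\r\n==========\x07", 5))
}
```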
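**New files**:
- `internal/content/content.go` - module entry point
- `internal/content/filter.go` - basic character filtering
- `internal/content/parser.go` - content parser and language detection
**Configuration changes**:
- `configs/config.yaml` gains a `skip_keywords` option
- Users can supply their own list of keywords to leave untranslated
**Usage example**: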
```bash
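# Translate a document that contains code; code and comments are detected automatically
# (single quotes keep the shell from treating the backticks as command substitution)
yoyo '这是一个文档 ```js // TODO: fix this ```'
# The code block stays unchanged; only the words inside comments are translated
# TODO: 修复这个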
```
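To make the code/text separation concrete, here is a minimal sketch that splits input into code and non-code segments before anything is sent for translation. It is an assumption-laden simplification of what a parser like this might do: the `segment` type, `splitSegments`, and the exact pattern are illustrative and do not reproduce `internal/content/parser.go`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// segment is one piece of the input; code is true for fenced or inline code.
type segment struct {
	code bool
	text string
}

// fence is the three-backtick Markdown code fence marker.
var fence = strings.Repeat("`", 3)

// codeSpan matches a fenced block (fence ... fence) or a single-line inline code span.
var codeSpan = regexp.MustCompile("(?s)" + fence + ".*?" + fence + "|`[^`\n]+`")

// splitSegments returns the input as alternating text and code segments.
func splitSegments(text string) []segment {
	var out []segment
	last := 0
	for _, m := range codeSpan.FindAllStringIndex(text, -1) {
		if m[0] > last {
			out = append(out, segment{code: false, text: text[last:m[0]]})
		}
		out = append(out, segment{code: true, text: text[m[0]:m[1]]})
		last = m[1]
	}
	if last < len(text) {
		out = append(out, segment{code: false, text: text[last:]})
	}
	return out
}

func main() {
	input := "doc " + fence + "js\n// TODO: fix this\n" + fence + " and `x := 1` end"
	for _, s := range splitSegments(input) {
		fmt.Printf("code=%v %q\n", s.code, s.text)
	}
}
```

Only the segments with `code == false` would be candidates for translation; the code segments can be written back verbatim when the output is reassembled.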
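**Discussion notes**:
- [Content filtering and code handling design](taolun.md#内容过滤与代码处理设计)
**Next steps**:
- Implement more providers: Volcano Engine, National Supercomputing, Qwen, OpenAI-compatible
- Add a configuration file path lookup mechanism
- Implement a configuration file migration tool
- Improve error handling and user experience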
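### 0.2.0 (2026-03-29) - Language Support and Configuration Wizard
**Type**: Feature release
**Status**: In development
**Changes**:
- ✅ Add smart language code parsing module (internal/lang)
- ✅ Support the `--lang` flag for specifying the target language
- ✅ Support multiple language code formats (standard BCP 47, aliases, language names)
- ✅ Implement the interactive onboard configuration wizard
- ✅ Add language fields to the config struct
- ✅ Add the survey library dependency for the interactive UI
- ✅ Improve the CLI command interface
- ✅ Add unit tests for the language module
**New files**:
- `internal/lang/lang.go` - language code parsing module
- `internal/lang/lang_test.go` - language module tests
- `internal/onboard/onboard.go` - configuration wizard implementation
**Supported language codes**:
- Standard format: zh-CN, zh-TW, en-US, en-GB, ja, ko, es, fr, de, etc.
- Short aliases: cn (Chinese), en (English), jp (Japanese), kr (Korean), etc.
- Language names: chinese (Chinese), english (English), japanese (Japanese), etc.
**Usage examples**: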
```bash
# Basic translation
yoyo "Hello world"
yoyo --lang=cn "Hello world"
yoyo --lang=en "你好世界"
yoyo --lang=zh-TW "Hello world"
# Configuration wizard
yoyo onboard
yoyo onboard --force
```
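For readers curious how a `--lang` option like the one above could be parsed, here is a sketch built on the standard `flag` package. Which CLI library the project really uses is not shown in this document, so `parseArgs` is hypothetical; the `zh-CN` default mirrors `default_target_lang` in `configs/config.yaml`, and the `onboard` subcommand is left out.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

// parseArgs (hypothetical helper) extracts the --lang value and joins the
// remaining arguments into the text to translate.
func parseArgs(args []string) (lang, text string, err error) {
	fs := flag.NewFlagSet("yoyo", flag.ContinueOnError)
	fs.StringVar(&lang, "lang", "zh-CN", "target language code or alias")
	if err = fs.Parse(args); err != nil {
		return "", "", err
	}
	return lang, strings.Join(fs.Args(), " "), nil
}

func main() {
	lang, text, err := parseArgs(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("lang=%s text=%q\n", lang, text)
}
```

The returned `lang` value would still need to go through alias resolution (for example `cn` to `zh-CN`) before being handed to a provider.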
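**Discussion notes**:
- [Language code parsing design](taolun.md#语言代码解析设计)
- [Onboard configuration wizard](taolun.md#onboard配置向导)
**Next steps**:
- Implement more providers: Volcano Engine, National Supercomputing, Qwen, OpenAI-compatible
- Add a configuration file path lookup mechanism
- Implement a configuration file migration tool
- Improve error handling and user experience
### 0.0.3 (2026-03-29) - Environment Variable Loading Fix
**Type**: Bugfix release
**Status**: In development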


@@ -4,6 +4,8 @@
default_provider: "siliconflow"
default_model: "gpt-3.5-turbo"
timeout: 30
default_source_lang: "auto" # Default source language; "auto" enables automatic detection
default_target_lang: "zh-CN" # Default target language (Simplified Chinese)
providers:
siliconflow:

10
go.mod

@@ -3,6 +3,16 @@ module github.com/titor/fanyi
go 1.26.1
require (
github.com/AlecAivazis/survey/v2 v2.3.7 // indirect
github.com/go-enry/go-enry/v2 v2.9.5 // indirect
github.com/go-enry/go-oniguruma v1.2.1 // indirect
github.com/joho/godotenv v1.5.1 // indirect
github.com/kballard/go-shellquote v0.0.0-20180428030007-95032a82bc51 // indirect
github.com/mattn/go-colorable v0.1.2 // indirect
github.com/mattn/go-isatty v0.0.8 // indirect
github.com/mgutz/ansi v0.0.0-20170206155736-9520e82c474b // indirect
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f // indirect
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211 // indirect
golang.org/x/text v0.4.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)

57
go.sum

@@ -1,5 +1,62 @@
github.com/AlecAivazis/survey/v2 v2.3.7 h1:6I/u8FvytdGsgonrYsVn2t8t4QiRnh6QSTqkkhIiSjQ=
github.com/AlecAivazis/survey/v2 v2.3.7/go.mod h1:xUTIdE4KCOIjsBAE1JYsUPoCqYdZ1reCfTwbto0Fduo=
github.com/Netflix/go-expect v0.0.0-20220104043353-73e0943537d2/go.mod h1:HBCaDeC1lPdgDeDbhX8XFpy1jqjK0IBG8W5K+xYqA0w=
github.com/creack/pty v1.1.17/go.mod h1:MOBLtS5ELjhRRrroQr9kyvTxUAFNvYEK993ew/Vr4O4=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/go-enry/go-enry/v2 v2.9.5 h1:HPhAQQHYwJgihL2PxBZiUMFWiROsGwOBdB6/D8zCUhY=
github.com/go-enry/go-enry/v2 v2.9.5/go.mod h1:9yrj4ES1YrbNb1Wb7/PWYr2bpaCXUGRt0uafN0ISyG8=
github.com/go-enry/go-oniguruma v1.2.1 h1:k8aAMuJfMrqm/56SG2lV9Cfti6tC4x8673aHCcBk+eo=
github.com/go-enry/go-oniguruma v1.2.1/go.mod h1:bWDhYP+S6xZQgiRL7wlTScFYBe023B6ilRZbCAD5Hf4=
github.com/hinshun/vt10x v0.0.0-20220119200601-820417d04eec/go.mod h1:Q48J4R4DvxnHolD5P8pOtXigYlRuPLGl6moFx3ulM68=
github.com/joho/godotenv v1.5.1 h1:7eLL/+HRGLY0ldzfGMeQkb7vMd0as4CfYvUVzLqw0N0=
github.com/joho/godotenv v1.5.1/go.mod h1:f4LDr5Voq0i2e/R5DDNOoa2zzDfwtkZa6DnEwAbqwq4=
github.com/kballard/go-shellquote v0.0.0-20180428030007-95032a82bc51 h1:Z9n2FFNUXsshfwJMBgNA0RU6/i7WVaAegv3PtuIHPMs=
github.com/kballard/go-shellquote v0.0.0-20180428030007-95032a82bc51/go.mod h1:CzGEWj7cYgsdH8dAjBGEr58BoE7ScuLd+fwFZ44+/x8=
github.com/mattn/go-colorable v0.1.2 h1:/bC9yWikZXAL9uJdulbSfyVNIR3n3trXl+v8+1sx8mU=
github.com/mattn/go-colorable v0.1.2/go.mod h1:U0ppj6V5qS13XJ6of8GYAs25YV2eR4EVcfRqFIhoBtE=
github.com/mattn/go-isatty v0.0.8 h1:HLtExJ+uU2HOZ+wI0Tt5DtUDrx8yhUqDcp7fYERX4CE=
github.com/mattn/go-isatty v0.0.8/go.mod h1:Iq45c/XA43vh69/j3iqttzPXn0bhXyGjM0Hdxcsrc5s=
github.com/mgutz/ansi v0.0.0-20170206155736-9520e82c474b h1:j7+1HpAFS1zy5+Q4qx1fWh90gTKwiN4QCGoY9TWyyO4=
github.com/mgutz/ansi v0.0.0-20170206155736-9520e82c474b/go.mod h1:01TrycV0kFyexm33Z7vhZRXopbI8J3TDReVlkTgMUxE=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw=
github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo=
github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU=
github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4=
github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc=
golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4=
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
golang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug+ECip1KBveYUHfp+8e9klMJ9c=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190222072716-a9d3bda3a223/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f h1:v4INt8xihDGvnrfjMDVXGxw9wrfxYyCjk0KbXjhR55s=
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211 h1:JGgROgKl9N8DuW20oFS5gxc+lE67/N3FcwmBPMe7ArY=
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=
golang.org/x/text v0.4.0 h1:BrVqGRd7+k1DiOgtnFvAkoQEWQvBc25ouMJM6429SFg=
golang.org/x/text v0.4.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v2 v2.2.8/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=


@@ -15,12 +15,17 @@ type Config struct {
DefaultProvider string `yaml:"default_provider"`
DefaultModel string `yaml:"default_model"`
Timeout int `yaml:"timeout"` // seconds
DefaultSourceLang string `yaml:"default_source_lang"` // default source language; "auto" means auto-detect
DefaultTargetLang string `yaml:"default_target_lang"` // default target language
// Provider configuration
Providers map[string]ProviderConfig `yaml:"providers"`
// Prompt configuration
Prompts map[string]string `yaml:"prompts"`
// Content filtering configuration
SkipKeywords []string `yaml:"skip_keywords"` // keywords that are left untranslated
}
// ProviderConfig holds the configuration for a single provider
@@ -97,6 +102,12 @@ func (c *Config) setDefaults() {
if c.DefaultModel == "" {
c.DefaultModel = "gpt-3.5-turbo"
}
if c.DefaultSourceLang == "" {
c.DefaultSourceLang = "auto" // auto-detect
}
if c.DefaultTargetLang == "" {
c.DefaultTargetLang = "zh-CN" // translate to Simplified Chinese by default
}
// Set default values for each provider
for name, provider := range c.Providers {
@@ -113,6 +124,16 @@ func (c *Config) setDefaults() {
if c.Prompts == nil {
c.Prompts = make(map[string]string)
}
// Set the default skip keywords
if c.SkipKeywords == nil {
c.SkipKeywords = []string{
"TODO", "FIXME", "HACK", "XXX", "NOTE",
"BUG", "WARN", "IMPORTANT", "TODO:",
"FIXME:", "HACK:", "XXX:", "NOTE:",
"BUG:", "WARN:", "IMPORTANT:",
}
}
}
// GetProviderConfig returns the configuration for the given provider
@@ -190,6 +211,8 @@ func (c *Config) String() string {
builder.WriteString(fmt.Sprintf("DefaultProvider: %s\n", c.DefaultProvider))
builder.WriteString(fmt.Sprintf("DefaultModel: %s\n", c.DefaultModel))
builder.WriteString(fmt.Sprintf("Timeout: %d seconds\n", c.Timeout))
builder.WriteString(fmt.Sprintf("DefaultSourceLang: %s\n", c.DefaultSourceLang))
builder.WriteString(fmt.Sprintf("DefaultTargetLang: %s\n", c.DefaultTargetLang))
builder.WriteString("Providers:\n")
for name, provider := range c.Providers {
builder.WriteString(fmt.Sprintf(" %s: enabled=%v, model=%s\n", name, provider.Enabled, provider.Model))


@@ -0,0 +1,17 @@
package content
import (
"github.com/go-enry/go-enry/v2"
)
const (
Version = "1.0.0"
)
func DetectLanguage(text string) string {
return enry.GetLanguage("", []byte(text))
}
func Filter(text string) string {
return FilterBasic(text, nil)
}


@@ -0,0 +1,55 @@
package content
import (
"regexp"
"strings"
)
type FilterOptions struct {
RemoveControlChars bool
NormalizeLineBreaks bool
MaxConsecutiveSymbols int
}
var defaultFilterOptions = &FilterOptions{
RemoveControlChars: true,
NormalizeLineBreaks: true,
MaxConsecutiveSymbols: 20,
}
var controlCharsRegex = regexp.MustCompile(`[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]`)
func FilterBasic(text string, opts *FilterOptions) string {
if opts == nil {
opts = defaultFilterOptions
}
result := text
if opts.RemoveControlChars {
result = controlCharsRegex.ReplaceAllString(result, "")
}
if opts.NormalizeLineBreaks {
result = strings.ReplaceAll(result, "\r\n", "\n")
result = strings.ReplaceAll(result, "\r", "\n")
}
if opts.MaxConsecutiveSymbols > 0 {
result = truncateConsecutiveSymbols(result, opts.MaxConsecutiveSymbols)
}
return result
}
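// truncateConsecutiveSymbols caps every run of a decorative symbol at maxCount repeats.
func truncateConsecutiveSymbols(text string, maxCount int) string {
symbols := []string{"=", "-", "_", "*", "#", "~", "`", "."}
for _, symbol := range symbols {
// Match maxCount copies followed by at least one more, i.e. any run longer than maxCount.
// QuoteMeta is needed because symbols such as "." and "*" are regex metacharacters.
replacement := strings.Repeat(symbol, maxCount)
pattern := regexp.MustCompile(regexp.QuoteMeta(replacement) + `(?:` + regexp.QuoteMeta(symbol) + `)+`)
text = pattern.ReplaceAllLiteralString(text, replacement)
}
return text
}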

453
internal/content/parser.go Normal file

@@ -0,0 +1,453 @@
package content
import (
"fmt"
"regexp"
"strings"
"github.com/go-enry/go-enry/v2"
)
type SegmentType int
const (
SegmentTypeText SegmentType = iota
SegmentTypeCodeBlock
SegmentTypeInlineCode
SegmentTypeComment
)
func (t SegmentType) String() string {
switch t {
case SegmentTypeText:
return "text"
case SegmentTypeCodeBlock:
return "code_block"
case SegmentTypeInlineCode:
return "inline_code"
case SegmentTypeComment:
return "comment"
default:
return "unknown"
}
}
type ContentSegment struct {
Type SegmentType
Content string
Translated string
Language string
IsComment bool
StartPos int
EndPos int
}
type ParseResult struct {
Segments []ContentSegment
SourceLang string
HasCode bool
}
type languageCommentPatterns struct {
LineComment string
BlockComment []string
}
var languagePatterns = map[string]languageCommentPatterns{
"javascript": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"typescript": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"java": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"kotlin": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"scala": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"c": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"cpp": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"c#": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"go": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"rust": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"php": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"swift": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"objective-c": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"scss": {LineComment: `//`, BlockComment: []string{`/*`, `*/`}},
"css": {LineComment: ``, BlockComment: []string{`/*`, `*/`}},
"less": {LineComment: ``, BlockComment: []string{`/*`, `*/`}},
"html": {LineComment: ``, BlockComment: []string{`<!--`, `-->`}},
"xml": {LineComment: ``, BlockComment: []string{`<!--`, `-->`}},
"sql": {LineComment: `--`, BlockComment: []string{`/*`, `*/`}},
"python": {LineComment: `#`, BlockComment: []string{`"""`, `"""`}},
"ruby": {LineComment: `#`, BlockComment: []string{`=begin`, `=end`}},
"shell": {LineComment: `#`, BlockComment: []string{}},
"bash": {LineComment: `#`, BlockComment: []string{}},
"powershell": {LineComment: `#`, BlockComment: []string{`<#`, `#>`}},
"yaml": {LineComment: `#()`, BlockComment: []string{}},
"json": {LineComment: ``, BlockComment: []string{}},
"markdown": {LineComment: ``, BlockComment: []string{}},
"vue": {LineComment: `//()`, BlockComment: []string{`/*`, `*/`, `<!--`, `-->`}},
"svelte": {LineComment: `//()`, BlockComment: []string{`/*`, `*/`}},
"jsx": {LineComment: `//()`, BlockComment: []string{`/*`, `*/`}},
"tsx": {LineComment: `//()`, BlockComment: []string{`/*`, `*/`}},
}
var defaultPatterns = languageCommentPatterns{
LineComment: `//`,
BlockComment: []string{`/*`, `*/`},
}
type Parser struct {
skipKeywords []string
fallbackLang string
}
func NewParser(skipKeywords []string) *Parser {
if skipKeywords == nil {
skipKeywords = []string{
"TODO", "FIXME", "HACK", "XXX", "NOTE",
"BUG", "WARN", "IMPORTANT", "TODO:",
"FIXME:", "HACK:", "XXX:", "NOTE:",
"BUG:", "WARN:", "IMPORTANT:",
}
}
return &Parser{
skipKeywords: skipKeywords,
fallbackLang: "javascript",
}
}
func (p *Parser) Parse(text string) (*ParseResult, error) {
result := &ParseResult{
Segments: []ContentSegment{},
}
detectedLang := p.detectLanguage(text)
result.SourceLang = detectedLang
segments := p.splitIntoSegments(text, result.SourceLang)
for _, seg := range segments {
if seg.Type == SegmentTypeCodeBlock || seg.Type == SegmentTypeInlineCode {
result.HasCode = true
}
result.Segments = append(result.Segments, seg)
}
return result, nil
}
func (p *Parser) detectLanguage(text string) string {
lines := strings.Split(text, "\n")
var codeLines []string
inCodeBlock := false
for _, line := range lines {
trimmed := strings.TrimSpace(line)
if strings.HasPrefix(trimmed, "```") {
inCodeBlock = !inCodeBlock
continue
}
if inCodeBlock && trimmed != "" {
codeLines = append(codeLines, trimmed)
}
}
if len(codeLines) == 0 {
for _, line := range lines {
if strings.TrimSpace(line) != "" {
codeLines = append(codeLines, line)
}
}
}
if len(codeLines) == 0 {
return p.fallbackLang
}
sample := strings.Join(codeLines[:min(len(codeLines), 10)], "\n")
lang := enry.GetLanguage("", []byte(sample))
if lang == "" {
return p.fallbackLang
}
return strings.ToLower(lang)
}
func (p *Parser) splitIntoSegments(text string, lang string) []ContentSegment {
segments := []ContentSegment{}
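// (?m) lets ^ match at the start of each line so the closing fence is found; (?s) lets matches span newlines.
codeBlockPattern := regexp.MustCompile("(?ms)```[\\s\\S]*?^```|`[^`]+`")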
matches := codeBlockPattern.FindAllStringIndex(text, -1)
if len(matches) == 0 {
segments = append(segments, ContentSegment{
Type: SegmentTypeText,
Content: text,
StartPos: 0,
EndPos: len(text),
})
return segments
}
lastEnd := 0
for _, match := range matches {
start, end := match[0], match[1]
if start > lastEnd {
textPart := text[lastEnd:start]
textSegments := p.parseTextContent(textPart, lang)
segments = append(segments, textSegments...)
}
content := text[start:end]
isInline := len(content) > 0 && content[0] == '`' && (len(content) == 1 || content[len(content)-1] == '`')
if strings.HasPrefix(content, "```") {
segments = append(segments, ContentSegment{
Type: SegmentTypeCodeBlock,
Content: content,
Language: p.detectCodeBlockLang(content),
StartPos: start,
EndPos: end,
})
} else if isInline {
segments = append(segments, ContentSegment{
Type: SegmentTypeInlineCode,
Content: content,
Language: lang,
StartPos: start,
EndPos: end,
})
}
lastEnd = end
}
if lastEnd < len(text) {
textPart := text[lastEnd:]
textSegments := p.parseTextContent(textPart, lang)
segments = append(segments, textSegments...)
}
return segments
}
func (p *Parser) parseTextContent(text string, lang string) []ContentSegment {
segments := []ContentSegment{}
langPatterns := getLanguagePatterns(lang)
if langPatterns.SingleLine == "" && len(langPatterns.MultiLine) == 0 {
segments = append(segments, ContentSegment{
Type: SegmentTypeText,
Content: text,
Language: lang,
StartPos: 0,
EndPos: len(text),
})
return segments
}
commentPatterns := p.buildCommentRegex(langPatterns)
if commentPatterns == nil {
segments = append(segments, ContentSegment{
Type: SegmentTypeText,
Content: text,
Language: lang,
StartPos: 0,
EndPos: len(text),
})
return segments
}
matches := commentPatterns.FindAllStringIndex(text, -1)
if len(matches) == 0 {
segments = append(segments, ContentSegment{
Type: SegmentTypeText,
Content: text,
Language: lang,
StartPos: 0,
EndPos: len(text),
})
return segments
}
lastEnd := 0
for _, match := range matches {
start, end := match[0], match[1]
if start > lastEnd {
segments = append(segments, ContentSegment{
Type: SegmentTypeText,
Content: text[lastEnd:start],
Language: lang,
StartPos: lastEnd,
EndPos: start,
})
}
segments = append(segments, ContentSegment{
Type: SegmentTypeComment,
Content: text[start:end],
IsComment: true,
Language: lang,
StartPos: start,
EndPos: end,
})
lastEnd = end
}
if lastEnd < len(text) {
segments = append(segments, ContentSegment{
Type: SegmentTypeText,
Content: text[lastEnd:],
Language: lang,
StartPos: lastEnd,
EndPos: len(text),
})
}
return segments
}
type languageCommentRegex struct {
SingleLine string
MultiLine []struct {
Start string
End string
}
}
func (p *Parser) buildCommentRegex(patterns languageCommentRegex) *regexp.Regexp {
var parts []string
if patterns.SingleLine != "" {
parts = append(parts, patterns.SingleLine+`.*$`)
}
for _, multi := range patterns.MultiLine {
if multi.Start != "" && multi.End != "" {
escapedStart := regexp.QuoteMeta(multi.Start)
escapedEnd := regexp.QuoteMeta(multi.End)
parts = append(parts, escapedStart+`[\s\S]*?`+escapedEnd)
}
}
if len(parts) == 0 {
return nil
}
pattern := `(?m)` + strings.Join(parts, "|")
return regexp.MustCompile(pattern)
}
func getLanguagePatterns(lang string) languageCommentRegex {
patterns, ok := languagePatterns[lang]
if !ok {
patterns = defaultPatterns
}
result := languageCommentRegex{
SingleLine: patterns.LineComment,
}
// BlockComment stores delimiters as consecutive (start, end) pairs.
for i := 0; i+1 < len(patterns.BlockComment); i += 2 {
result.MultiLine = append(result.MultiLine, struct {
Start string
End string
}{Start: patterns.BlockComment[i], End: patterns.BlockComment[i+1]})
}
return result
}
func (p *Parser) detectCodeBlockLang(codeBlock string) string {
lines := strings.Split(codeBlock, "\n")
if len(lines) < 2 {
return ""
}
firstLine := strings.TrimSpace(lines[0])
firstLine = strings.TrimPrefix(firstLine, "```")
firstLine = strings.TrimSpace(firstLine)
if firstLine != "" {
lang := strings.ToLower(firstLine)
if _, ok := languagePatterns[lang]; ok {
return lang
}
}
return ""
}
func (p *Parser) BuildPrompt(result *ParseResult) string {
var prompt strings.Builder
prompt.WriteString("你是一位专业的技术翻译。请翻译以下内容,遵守以下规则:\n\n")
prompt.WriteString("需要翻译的部分:\n")
prompt.WriteString("- 普通文本:翻译成目标语言\n")
prompt.WriteString("- 代码注释:只翻译注释中有意义的词汇,技术术语保留原语言\n\n")
prompt.WriteString("需要保持不变的部分:\n")
prompt.WriteString("- 代码块(如 ```javascript ... ```)保持原样\n")
prompt.WriteString("- 行内代码(如 `const count = 10`)保持原样\n")
if len(p.skipKeywords) > 0 {
prompt.WriteString(fmt.Sprintf("- 以下关键词不翻译:%s\n", strings.Join(p.skipKeywords, "、")))
}
prompt.WriteString("\n请将需要翻译的部分翻译成中文其他部分保持不变。\n\n")
prompt.WriteString("原文:\n---\n")
textToTranslate := p.extractTextForTranslation(result)
prompt.WriteString(textToTranslate)
prompt.WriteString("\n---")
return prompt.String()
}
func (p *Parser) extractTextForTranslation(result *ParseResult) string {
var text strings.Builder
for _, seg := range result.Segments {
switch seg.Type {
case SegmentTypeText:
text.WriteString(seg.Content)
case SegmentTypeComment:
text.WriteString(seg.Content)
case SegmentTypeCodeBlock, SegmentTypeInlineCode:
}
}
return text.String()
}
func (p *Parser) Reconstruct(result *ParseResult, translatedText string) string {
translatedLines := strings.Split(translatedText, "\n")
var output strings.Builder
textIndex := 0
for _, seg := range result.Segments {
switch seg.Type {
case SegmentTypeText, SegmentTypeComment:
if textIndex < len(translatedLines) {
output.WriteString(translatedLines[textIndex])
textIndex++
}
case SegmentTypeCodeBlock, SegmentTypeInlineCode:
output.WriteString(seg.Content)
}
}
return output.String()
}
func min(a, b int) int {
if a < b {
return a
}
return b
}

317
internal/lang/lang.go Normal file

@@ -0,0 +1,317 @@
package lang
import (
"fmt"
"sort"
"strings"
)
// Language code mapping table
var languageMap = map[string]string{
// Chinese variants
"cn": "zh-CN",
"zh": "zh-CN", // defaults to Simplified Chinese
"zhcn": "zh-CN",
"zhtw": "zh-TW",
"zhhk": "zh-HK",
"zh-hans": "zh-CN",
"zh-hant": "zh-TW",
"chinese": "zh-CN",
"简体中文": "zh-CN",
"繁体中文": "zh-TW",
// English variants
"en": "en-US", // defaults to American English
"us": "en-US",
"uk": "en-GB",
"gb": "en-GB",
"english": "en-US",
"美式英语": "en-US",
"英式英语": "en-GB",
// Japanese
"jp": "ja",
"ja": "ja",
"japanese": "ja",
"日语": "ja",
// Korean
"kr": "ko",
"ko": "ko",
"korean": "ko",
"韩语": "ko",
// Spanish
"es": "es-ES",
"spanish": "es-ES",
"西班牙语": "es-ES",
// French
"fr": "fr-FR",
"french": "fr-FR",
"法语": "fr-FR",
// German
"de": "de-DE",
"german": "de-DE",
"德语": "de-DE",
// Russian
"ru": "ru-RU",
"russian": "ru-RU",
"俄语": "ru-RU",
// Portuguese
"pt": "pt-PT",
"portuguese": "pt-PT",
"葡萄牙语": "pt-PT",
"br": "pt-BR", // Brazilian Portuguese
// Italian
"it": "it-IT",
"italian": "it-IT",
"意大利语": "it-IT",
// Arabic
"ar": "ar-SA",
"arabic": "ar-SA",
"阿拉伯语": "ar-SA",
// Hindi
"hi": "hi-IN",
"hindi": "hi-IN",
"印地语": "hi-IN",
// Other languages
"nl": "nl-NL", // Dutch
"dutch": "nl-NL",
"sv": "sv-SE", // Swedish
"swedish": "sv-SE",
"no": "nb-NO", // Norwegian
"norwegian": "nb-NO",
"da": "da-DK", // Danish
"danish": "da-DK",
"fi": "fi-FI", // Finnish
"finnish": "fi-FI",
"pl": "pl-PL", // Polish
"polish": "pl-PL",
"tr": "tr-TR", // Turkish
"turkish": "tr-TR",
"th": "th-TH", // Thai
"thai": "th-TH",
"vi": "vi-VN", // Vietnamese
"vietnamese": "vi-VN",
"id": "id-ID", // Indonesian
"indonesian": "id-ID",
"ms": "ms-MY", // Malay
"malay": "ms-MY",
}
// languageNames maps a language code to its display name
var languageNames = map[string]string{
"zh-CN": "中文(简体)",
"zh-TW": "中文(繁体)",
"zh-HK": "中文(香港)",
"en-US": "English (US)",
"en-GB": "English (UK)",
"ja": "日本語",
"ko": "한국어",
"es-ES": "Español",
"fr-FR": "Français",
"de-DE": "Deutsch",
"ru-RU": "Русский",
"pt-PT": "Português",
"pt-BR": "Português (Brasil)",
"it-IT": "Italiano",
"ar-SA": "العربية",
"hi-IN": "हिन्दी",
"nl-NL": "Nederlands",
"sv-SE": "Svenska",
"nb-NO": "Norsk",
"da-DK": "Dansk",
"fi-FI": "Suomi",
"pl-PL": "Polski",
"tr-TR": "Türkçe",
"th-TH": "ไทย",
"vi-VN": "Tiếng Việt",
"id-ID": "Bahasa Indonesia",
"ms-MY": "Bahasa Melayu",
}
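// ParseLanguageCode parses a language code.
// It accepts several formats: standard BCP 47 tags, aliases, language names, and so on.
func ParseLanguageCode(input string) string {
if input == "" {
return ""
}
// Trim surrounding whitespace and lowercase for matching
trimmed := strings.TrimSpace(input)
lower := strings.ToLower(trimmed)
// Direct alias match
if code, exists := languageMap[lower]; exists {
return code
}
// Try to parse a BCP 47 style tag such as zh-CN or en-US
if isValidLanguageTag(trimmed) {
return normalizeLanguageTag(trimmed)
}
// If the input cannot be parsed, return it unchanged
return input
}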
// isValidLanguageTag reports whether tag has a valid language tag shape
func isValidLanguageTag(tag string) bool {
// Simple shape check: language code, optionally followed by a region code
parts := strings.Split(tag, "-")
if len(parts) == 1 {
// Language code only, e.g. "zh", "en"
return len(parts[0]) >= 2 && len(parts[0]) <= 3
}
if len(parts) == 2 {
// Language code plus region code, e.g. "zh-CN", "en-US"
return len(parts[0]) >= 2 && len(parts[0]) <= 3 && len(parts[1]) == 2
}
return false
}
// normalizeLanguageTag normalizes a language tag
func normalizeLanguageTag(tag string) string {
parts := strings.Split(tag, "-")
if len(parts) == 1 {
// Language code only; use the default region
defaultRegions := map[string]string{
"zh": "CN",
"en": "US",
"ja": "JP",
"ko": "KR",
"es": "ES",
"fr": "FR",
"de": "DE",
"ru": "RU",
"pt": "PT",
"it": "IT",
"ar": "SA",
"hi": "IN",
"nl": "NL",
"sv": "SE",
"no": "NO",
"da": "DK",
"fi": "FI",
"pl": "PL",
"tr": "TR",
"th": "TH",
"vi": "VN",
"id": "ID",
"ms": "MY",
}
if region, exists := defaultRegions[parts[0]]; exists {
return fmt.Sprintf("%s-%s", strings.ToLower(parts[0]), strings.ToUpper(region))
}
return tag
}
if len(parts) == 2 {
// Normalize the format: lowercase language, uppercase region
return fmt.Sprintf("%s-%s", strings.ToLower(parts[0]), strings.ToUpper(parts[1]))
}
return tag
}
// GetLanguageName returns the display name of a language, or the code itself if it is unknown
func GetLanguageName(code string) string {
if name, exists := languageNames[code]; exists {
return name
}
return code
}
// GetLanguageNameOrDefault returns the display name of a language, or defaultName if the code is unknown
func GetLanguageNameOrDefault(code string, defaultName string) string {
if name, exists := languageNames[code]; exists {
return name
}
return defaultName
}
// SupportedLanguages returns the sorted list of supported language codes
func SupportedLanguages() []string {
codes := make([]string, 0, len(languageNames))
for code := range languageNames {
codes = append(codes, code)
}
sort.Strings(codes)
return codes
}
// GetLanguageSuggestions returns language suggestions (for fuzzy matching)
func GetLanguageSuggestions(input string, limit int) []string {
if input == "" {
return []string{}
}
lower := strings.ToLower(input)
suggestions := make([]string, 0)
for alias, code := range languageMap {
if strings.Contains(alias, lower) || strings.Contains(strings.ToLower(code), lower) {
// Avoid duplicates
found := false
for _, s := range suggestions {
if s == code {
found = true
break
}
}
if !found {
suggestions = append(suggestions, code)
}
}
}
// Map iteration order is random; sort so the result is deterministic
sort.Strings(suggestions)
// Enforce the limit
if len(suggestions) > limit {
suggestions = suggestions[:limit]
}
return suggestions
}
// IsLanguageSupported reports whether the language is supported
func IsLanguageSupported(code string) bool {
normalized := ParseLanguageCode(code)
_, exists := languageNames[normalized]
return exists
}
// GetCommonLanguages returns the list of commonly used languages
func GetCommonLanguages() []string {
return []string{
"zh-CN", // Chinese (Simplified)
"en-US", // English (US)
"ja", // Japanese
"ko", // Korean
"es-ES", // Spanish
"fr-FR", // French
"de-DE", // German
"ru-RU", // Russian
"pt-PT", // Portuguese
"it-IT", // Italian
}
}
// GetLanguageDirection returns the text direction of a language ("ltr" or "rtl")
func GetLanguageDirection(code string) string {
rtlLanguages := map[string]bool{
"ar-SA": true, // Arabic
"he-IL": true, // Hebrew
"fa-IR": true, // Persian
"ur-PK": true, // Urdu
}
if rtlLanguages[code] {
return "rtl"
}
return "ltr"
}

224
internal/lang/lang_test.go Normal file

@@ -0,0 +1,224 @@
package lang
import (
"testing"
)
func TestParseLanguageCode(t *testing.T) {
tests := []struct {
input string
expected string
}{
// Chinese variants
{"cn", "zh-CN"},
{"zh", "zh-CN"},
{"zh-CN", "zh-CN"},
{"zh-TW", "zh-TW"},
{"zh-HK", "zh-HK"},
{"chinese", "zh-CN"},
{"简体中文", "zh-CN"},
// English variants
{"en", "en-US"},
{"en-US", "en-US"},
{"en-GB", "en-GB"},
{"us", "en-US"},
{"uk", "en-GB"},
{"english", "en-US"},
// Other languages
{"jp", "ja"},
{"ja", "ja"},
{"japanese", "ja"},
{"kr", "ko"},
{"ko", "ko"},
{"korean", "ko"},
{"es", "es-ES"},
{"spanish", "es-ES"},
{"fr", "fr-FR"},
{"french", "fr-FR"},
{"de", "de-DE"},
{"german", "de-DE"},
// Empty input
{"", ""},
// Unknown language (should return the original input)
{"unknown", "unknown"},
}
for _, tt := range tests {
t.Run(tt.input, func(t *testing.T) {
result := ParseLanguageCode(tt.input)
if result != tt.expected {
t.Errorf("ParseLanguageCode(%q) = %q, want %q", tt.input, result, tt.expected)
}
})
}
}
func TestGetLanguageName(t *testing.T) {
tests := []struct {
code string
expected string
}{
{"zh-CN", "中文(简体)"},
{"zh-TW", "中文(繁体)"},
{"en-US", "English (US)"},
{"en-GB", "English (UK)"},
{"ja", "日本語"},
{"ko", "한국어"},
{"es-ES", "Español"},
{"fr-FR", "Français"},
{"de-DE", "Deutsch"},
{"unknown", "unknown"}, // unknown codes are returned unchanged
}
for _, tt := range tests {
t.Run(tt.code, func(t *testing.T) {
result := GetLanguageName(tt.code)
if result != tt.expected {
t.Errorf("GetLanguageName(%q) = %q, want %q", tt.code, result, tt.expected)
}
})
}
}
func TestSupportedLanguages(t *testing.T) {
languages := SupportedLanguages()
if len(languages) == 0 {
t.Error("SupportedLanguages() should not return an empty list")
}
// Check that a few key languages are in the list
expectedLanguages := []string{"zh-CN", "en-US", "ja", "ko"}
for _, expected := range expectedLanguages {
found := false
for _, lang := range languages {
if lang == expected {
found = true
break
}
}
if !found {
t.Errorf("Expected language %q not found in supported languages", expected)
}
}
}
func TestGetLanguageSuggestions(t *testing.T) {
tests := []struct {
input string
limit int
minCount int
}{
{"zh", 5, 1},
{"en", 5, 1},
{"chinese", 5, 1},
{"", 5, 0},
{"unknown", 5, 0},
}
for _, tt := range tests {
t.Run(tt.input, func(t *testing.T) {
suggestions := GetLanguageSuggestions(tt.input, tt.limit)
if len(suggestions) < tt.minCount {
t.Errorf("GetLanguageSuggestions(%q, %d) returned %d suggestions, want at least %d",
tt.input, tt.limit, len(suggestions), tt.minCount)
}
if len(suggestions) > tt.limit {
t.Errorf("GetLanguageSuggestions(%q, %d) returned %d suggestions, want at most %d",
tt.input, tt.limit, len(suggestions), tt.limit)
}
})
}
}
func TestIsLanguageSupported(t *testing.T) {
tests := []struct {
code string
expected bool
}{
{"zh-CN", true},
{"en-US", true},
{"ja", true},
{"unknown", false},
{"", false},
}
for _, tt := range tests {
t.Run(tt.code, func(t *testing.T) {
result := IsLanguageSupported(tt.code)
if result != tt.expected {
t.Errorf("IsLanguageSupported(%q) = %v, want %v", tt.code, result, tt.expected)
}
})
}
}
func TestGetCommonLanguages(t *testing.T) {
languages := GetCommonLanguages()
if len(languages) == 0 {
t.Error("GetCommonLanguages() should not return an empty list")
}
// Check that a few key languages are in the list
expectedLanguages := []string{"zh-CN", "en-US", "ja"}
for _, expected := range expectedLanguages {
found := false
for _, lang := range languages {
if lang == expected {
found = true
break
}
}
if !found {
t.Errorf("Expected language %q not found in common languages", expected)
}
}
}
func TestGetLanguageDirection(t *testing.T) {
tests := []struct {
code string
expected string
}{
{"zh-CN", "ltr"},
{"en-US", "ltr"},
{"ja", "ltr"},
{"ar-SA", "rtl"},
{"he-IL", "rtl"},
}
for _, tt := range tests {
t.Run(tt.code, func(t *testing.T) {
result := GetLanguageDirection(tt.code)
if result != tt.expected {
t.Errorf("GetLanguageDirection(%q) = %q, want %q", tt.code, result, tt.expected)
}
})
}
}
func TestNormalizeLanguageTag(t *testing.T) {
tests := []struct {
input string
expected string
}{
{"zh", "zh-CN"},
{"en", "en-US"},
{"ja", "ja-JP"},
{"zh-CN", "zh-CN"},
{"zh-tw", "zh-TW"},
{"EN-us", "en-US"},
}
for _, tt := range tests {
t.Run(tt.input, func(t *testing.T) {
result := normalizeLanguageTag(tt.input)
if result != tt.expected {
t.Errorf("normalizeLanguageTag(%q) = %q, want %q", tt.input, result, tt.expected)
}
})
}
}

305
internal/onboard/onboard.go Normal file

@@ -0,0 +1,305 @@
package onboard
import (
"fmt"
"os"
"path/filepath"
"github.com/AlecAivazis/survey/v2"
"github.com/titor/fanyi/internal/config"
"github.com/titor/fanyi/internal/lang"
)
// RunOnboard runs the interactive configuration wizard. When force is true,
// an existing configuration file is overwritten without asking.
func RunOnboard(force bool) error {
	fmt.Println("Welcome to the YOYO translation tool setup wizard!")
	fmt.Println("This wizard will help you configure the translation tool.")
	fmt.Println()

	// Check whether a configuration file already exists.
	configPath := "configs/config.yaml"
	if _, err := os.Stat(configPath); err == nil && !force {
		overwrite := false
		prompt := &survey.Confirm{
			Message: "A configuration file already exists. Do you want to reconfigure?",
			Default: false,
		}
		if err := survey.AskOne(prompt, &overwrite); err != nil {
			return fmt.Errorf("reading user input: %w", err)
		}
		if !overwrite {
			fmt.Println("Configuration cancelled.")
			return nil
		}
	}

	// Step 1: choose the primary provider.
	fmt.Println("Step 1: Choose the primary translation provider")
	providerName, err := SelectProvider()
	if err != nil {
		return fmt.Errorf("selecting provider: %w", err)
	}

	// Step 2: configure the primary provider.
	fmt.Println("\nStep 2: Configure the primary provider")
	providerConfig, err := ConfigureProvider(providerName)
	if err != nil {
		return fmt.Errorf("configuring provider: %w", err)
	}

	// Step 3: global settings.
	fmt.Println("\nStep 3: Global settings")
	globalConfig, err := GlobalSettings()
	if err != nil {
		return fmt.Errorf("collecting global settings: %w", err)
	}

	// Step 4: build and save the configuration.
	fmt.Println("\nStep 4: Save the configuration")
	configData := BuildConfig(providerName, providerConfig, globalConfig)
	if err := SaveConfig(configData, configPath); err != nil {
		return fmt.Errorf("saving configuration: %w", err)
	}

	fmt.Printf("\nSetup complete! Configuration saved to: %s\n", configPath)
	fmt.Println("\nYou can now translate with the following commands:")
	fmt.Println(" yoyo \"Hello world\"")
	fmt.Println(" yoyo --lang=cn \"Hello world\"")
	fmt.Println("\nFor more help, run: yoyo --help")
	return nil
}
// SelectProvider asks the user to pick the primary provider and returns its key.
func SelectProvider() (string, error) {
	providers := []string{
		"siliconflow",
		"volcano",
		"national",
		"qwen",
		"openai",
	}
	providerNames := map[string]string{
		"siliconflow": "SiliconFlow (recommended, free quota)",
		"volcano":     "Volcano Engine",
		"national":    "National Supercomputing",
		"qwen":        "Qwen (Tongyi Qianwen)",
		"openai":      "OpenAI-compatible format",
	}

	// Build the option list in the fixed order given by providers.
	options := make([]string, 0, len(providers))
	for _, p := range providers {
		options = append(options, providerNames[p])
	}

	var selected string
	prompt := &survey.Select{
		Message: "Choose the translation provider to use:",
		Options: options,
		Default: providerNames["siliconflow"],
	}
	if err := survey.AskOne(prompt, &selected); err != nil {
		return "", err
	}

	// Map the selected display name back to the provider key.
	for _, name := range providers {
		if providerNames[name] == selected {
			return name, nil
		}
	}
	return "siliconflow", nil
}
// ConfigureProvider collects the API key, API host and default model for a provider.
func ConfigureProvider(providerName string) (config.ProviderConfig, error) {
	// Default settings per provider.
	defaults := map[string]config.ProviderConfig{
		"siliconflow": {
			APIHost: "https://api.siliconflow.cn/v1",
			Model:   "siliconflow-base",
			Enabled: true,
		},
		"volcano": {
			APIHost: "https://api.volcengine.com/v1",
			Model:   "volcano-chat",
			Enabled: true,
		},
		"national": {
			APIHost: "https://api.nsc.gov.cn/v1",
			Model:   "nsc-base",
			Enabled: true,
		},
		"qwen": {
			APIHost: "https://dashscope.aliyuncs.com/compatible-mode/v1",
			Model:   "qwen-turbo",
			Enabled: true,
		},
		"openai": {
			APIHost: "https://api.openai.com/v1",
			Model:   "gpt-3.5-turbo",
			Enabled: true,
		},
	}
	// An unknown provider name yields the zero value, i.e. empty defaults.
	defaultConfig := defaults[providerName]
	cfg := config.ProviderConfig{
		APIHost: defaultConfig.APIHost,
		Model:   defaultConfig.Model,
		Enabled: defaultConfig.Enabled,
	}

	// Ask for the API key (required).
	apiKeyPrompt := &survey.Input{
		Message: fmt.Sprintf("Enter the API key for %s:", providerName),
		Help:    "The API key is used for authentication and will be stored in the configuration file",
	}
	if err := survey.AskOne(apiKeyPrompt, &cfg.APIKey, survey.WithValidator(survey.Required)); err != nil {
		return config.ProviderConfig{}, err
	}

	// Confirm the API host.
	apiHostPrompt := &survey.Input{
		Message: "API host (press Enter to keep the default):",
		Default: cfg.APIHost,
	}
	if err := survey.AskOne(apiHostPrompt, &cfg.APIHost); err != nil {
		return config.ProviderConfig{}, err
	}

	// Confirm the default model.
	modelPrompt := &survey.Input{
		Message: "Default model (press Enter to keep the default):",
		Default: cfg.Model,
	}
	if err := survey.AskOne(modelPrompt, &cfg.Model); err != nil {
		return config.ProviderConfig{}, err
	}
	return cfg, nil
}
// GlobalConfig holds the global settings collected by the wizard.
type GlobalConfig struct {
	DefaultProvider   string
	DefaultModel      string
	Timeout           int
	DefaultSourceLang string
	DefaultTargetLang string
}
// GlobalSettings asks for the global settings (default target language and timeout).
func GlobalSettings() (*GlobalConfig, error) {
	cfg := &GlobalConfig{
		DefaultProvider:   "siliconflow",
		DefaultModel:      "siliconflow-base",
		Timeout:           30,
		DefaultSourceLang: "auto",
		DefaultTargetLang: "zh-CN",
	}

	// Choose the default target language.
	targetLangOptions := lang.GetCommonLanguages()
	var targetLangDisplay []string
	for _, code := range targetLangOptions {
		targetLangDisplay = append(targetLangDisplay, fmt.Sprintf("%s (%s)", code, lang.GetLanguageName(code)))
	}
	targetLangPrompt := &survey.Select{
		Message: "Choose the default target language:",
		Options: targetLangDisplay,
		Default: fmt.Sprintf("%s (%s)", "zh-CN", lang.GetLanguageName("zh-CN")),
	}
	var selectedTarget string
	if err := survey.AskOne(targetLangPrompt, &selectedTarget); err != nil {
		return nil, err
	}

	// Map the selected display string back to its language code.
	for i, display := range targetLangDisplay {
		if display == selectedTarget {
			cfg.DefaultTargetLang = targetLangOptions[i]
			break
		}
	}

	// Ask for the timeout.
	timeoutPrompt := &survey.Input{
		Message: "API timeout (seconds):",
		Default: fmt.Sprintf("%d", cfg.Timeout),
	}
	var timeoutStr string
	if err := survey.AskOne(timeoutPrompt, &timeoutStr); err != nil {
		return nil, err
	}

	// Parse the timeout, falling back to 30 seconds on invalid input.
	if timeout := parseIntOrDefault(timeoutStr, 30); timeout > 0 {
		cfg.Timeout = timeout
	}
	return cfg, nil
}
// BuildConfig assembles the configuration object from the wizard answers.
func BuildConfig(providerName string, providerConfig config.ProviderConfig, globalConfig *GlobalConfig) *config.Config {
	// Provider configuration.
	providers := map[string]config.ProviderConfig{
		providerName: providerConfig,
	}
	// Built-in prompt templates, sent to the model verbatim.
	prompts := map[string]string{
		"technical": "你是一位专业的技术翻译,请准确翻译以下技术文档,保持专业术语的准确性。",
		"creative":  "你是一位富有创造力的翻译家,请用优美流畅的语言翻译以下内容。",
		"academic":  "你是一位学术翻译专家,请用严谨的学术语言翻译以下内容。",
		"simple":    "请用简单易懂的语言翻译以下内容。",
	}
	return &config.Config{
		DefaultProvider:   providerName,
		DefaultModel:      providerConfig.Model,
		Timeout:           globalConfig.Timeout,
		DefaultSourceLang: globalConfig.DefaultSourceLang,
		DefaultTargetLang: globalConfig.DefaultTargetLang,
		Providers:         providers,
		Prompts:           prompts,
	}
}
// SaveConfig writes the configuration file, creating its directory if needed.
func SaveConfig(cfg *config.Config, path string) error {
	// Make sure the directory exists.
	dir := filepath.Dir(path)
	if err := os.MkdirAll(dir, 0755); err != nil {
		return fmt.Errorf("creating config directory: %w", err)
	}
	// Delegate to the Save method of the config package.
	loader := &config.YAMLConfigLoader{}
	return loader.Save(cfg, path)
}
// parseIntOrDefault parses s as a positive integer and returns defaultValue
// when s is empty, cannot be parsed, or is not positive.
func parseIntOrDefault(s string, defaultValue int) int {
	if s == "" {
		return defaultValue
	}
	var result int
	if _, err := fmt.Sscanf(s, "%d", &result); err != nil {
		return defaultValue
	}
	if result <= 0 {
		return defaultValue
	}
	return result
}


@@ -6,6 +6,7 @@ import (
"time"
"github.com/titor/fanyi/internal/config"
"github.com/titor/fanyi/internal/content"
"github.com/titor/fanyi/internal/provider"
)
@@ -14,6 +15,7 @@ type Translator struct {
config *config.Config
provider provider.Provider
prompt *PromptManager
contentParser *content.Parser
}
// NewTranslator creates a Translator instance
@@ -22,6 +24,7 @@ func NewTranslator(config *config.Config, provider provider.Provider) *Translato
config: config,
provider: provider,
prompt: NewPromptManager(config.Prompts),
contentParser: content.NewParser(config.SkipKeywords),
}
}
@@ -31,15 +34,33 @@ func (t *Translator) Translate(ctx context.Context, text string, options *Transl
timeoutCtx, cancel := context.WithTimeout(ctx, time.Duration(t.config.Timeout)*time.Second)
defer cancel()
// Basic character filtering
filteredText := content.FilterBasic(text, nil)
// Parse the content (including code detection)
parseResult, parseErr := t.contentParser.Parse(filteredText)
// Select the prompt
prompt := ""
if options.PromptName != "" {
prompt = t.prompt.GetPrompt(options.PromptName)
}
// If the text contains code and parsing succeeded, use the enhanced prompt
if parseErr == nil && parseResult.HasCode {
enhancedPrompt := t.contentParser.BuildPrompt(parseResult)
if enhancedPrompt != "" {
if prompt != "" {
prompt = prompt + "\n\n" + enhancedPrompt
} else {
prompt = enhancedPrompt
}
}
}
// Build the request
req := &provider.TranslateRequest{
- Text: text,
+ Text: filteredText,
FromLang: options.FromLang,
ToLang: options.ToLang,
Prompt: prompt,
@@ -53,10 +74,17 @@ func (t *Translator) Translate(ctx context.Context, text string, options *Transl
return nil, fmt.Errorf("translation failed: %w", err)
}
translatedText := resp.Text
// If the text contains code and parsing succeeded, reconstruct the result
if parseErr == nil && parseResult.HasCode {
translatedText = t.contentParser.Reconstruct(parseResult, resp.Text)
}
// Build the result
return &TranslateResult{
Original: text,
- Translated: resp.Text,
+ Translated: translatedText,
FromLang: resp.FromLang,
ToLang: resp.ToLang,
Model: resp.Model,


@@ -189,3 +189,68 @@ func main() {
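2. AI edits taolun.md to record the discussion
3. AI updates changelog.md to record the version
4. AI updates memory.md to record lessons learned

---

## Lessons on language code handling

### Language code normalization

**Problem**: Several language code formats must be accepted, but a single standard format should be used internally.

**Solution**:
1. Use BCP 47 language tags as the standard format (e.g. `zh-CN`, `en-US`)
2. Implement the smart parsing function `ParseLanguageCode()`
3. Support alias mapping (e.g. `cn` → `zh-CN`, `en` → `en-US`)

**Best practices**:
- Write the language subtag in lowercase and the region subtag in uppercase (e.g. `zh-CN`, not `zh-cn`)
- Provide a language-name mapping for display (e.g. `zh-CN` → "中文(简体)")
- Support fuzzy matching and suggestions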
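
The normalization rules above can be sketched roughly as follows. This is a minimal illustration, not the project's `ParseLanguageCode()`: the `aliases` table and the helpers `canonicalCase` and `parse` are hypothetical names, and the alias set is deliberately tiny.

```go
package main

import (
	"fmt"
	"strings"
)

// aliases is a hypothetical, deliberately small alias table (lowercase keys).
var aliases = map[string]string{
	"cn": "zh-CN", "zh": "zh-CN", "chinese": "zh-CN",
	"en": "en-US", "english": "en-US",
	"jp": "ja",
}

// canonicalCase lowercases the language subtag and uppercases any
// two-letter subtag that follows it (treated here as a region).
func canonicalCase(tag string) string {
	parts := strings.Split(strings.ReplaceAll(tag, "_", "-"), "-")
	parts[0] = strings.ToLower(parts[0])
	for i := 1; i < len(parts); i++ {
		if len(parts[i]) == 2 {
			parts[i] = strings.ToUpper(parts[i])
		}
	}
	return strings.Join(parts, "-")
}

// parse resolves aliases first and otherwise only fixes the casing,
// so unknown input is returned (case-normalized) rather than rejected.
func parse(input string) string {
	trimmed := strings.TrimSpace(input)
	if code, ok := aliases[strings.ToLower(trimmed)]; ok {
		return code
	}
	return canonicalCase(trimmed)
}

func main() {
	for _, in := range []string{"cn", "EN-us", "zh-tw", "Chinese"} {
		fmt.Printf("%q -> %q\n", in, parse(in))
	}
}
```

Resolving aliases before fixing the casing keeps the alias table small, because every key only has to be stored in lowercase.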
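
### Lessons on interactive configuration

**Problem**: A command-line tool needs a friendly configuration interface.

**Solution**:
1. Use `github.com/AlecAivazis/survey/v2`
2. Implement a step-by-step configuration flow
3. Provide default values and confirmation prompts

**Notes**:
- Interactive libraries require a terminal
- Provide a non-interactive mode (e.g. a configuration file template)
- Handle errors gracefully so the program does not crash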
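
### Lessons on command-line argument parsing

**Problem**: The `flag` package in the Go standard library is limited, yet subcommands must be supported.

**Solution**:
1. Use the `flag` package to parse option flags
2. Handle subcommands (such as `onboard`) manually
3. Provide clear help text

**Handling naming conflicts**:
- Avoid variable names that clash with package names (e.g. an `onboard` variable and the `onboard` package)
- Use a suffix to tell them apart (e.g. `onboardFlag`)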
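
The approach above could look roughly like this. It is a sketch under assumptions: the `invocation` type and `parseArgs` function are hypothetical, and only the `--lang` and `--force` flags mentioned in the commit message are modeled. Each command gets its own `flag.FlagSet`, and the subcommand is detected by inspecting the first argument.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// invocation is a hypothetical result type for the parsed command line.
type invocation struct {
	Command string   // "translate" or "onboard"
	Lang    string   // value of --lang (translate only)
	Force   bool     // value of --force (onboard only)
	Args    []string // remaining positional arguments
}

// parseArgs handles the "onboard" subcommand by hand and delegates
// option parsing to a separate flag.FlagSet per command.
func parseArgs(args []string) (invocation, error) {
	if len(args) > 0 && args[0] == "onboard" {
		fs := flag.NewFlagSet("yoyo onboard", flag.ContinueOnError)
		fs.SetOutput(io.Discard) // keep the sketch quiet on parse errors
		forceFlag := fs.Bool("force", false, "reconfigure even if a config file exists")
		if err := fs.Parse(args[1:]); err != nil {
			return invocation{}, err
		}
		return invocation{Command: "onboard", Force: *forceFlag, Args: fs.Args()}, nil
	}
	fs := flag.NewFlagSet("yoyo", flag.ContinueOnError)
	fs.SetOutput(io.Discard)
	langFlag := fs.String("lang", "", "target language code")
	if err := fs.Parse(args); err != nil {
		return invocation{}, err
	}
	return invocation{Command: "translate", Lang: *langFlag, Args: fs.Args()}, nil
}

func main() {
	inv, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(2)
	}
	fmt.Printf("%+v\n", inv)
}
```

The `Flag` suffix on `forceFlag` and `langFlag` follows the naming advice above and avoids clashes with package names such as `lang`.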
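
## Lessons on configuration file management

### Configuration strategy during development

**Decision**: Use `.env` + `configs/config.yaml` during development.

**Reasons**:
1. Simplifies setting up the development environment
2. Follows the twelve-factor app principles
3. Avoids premature optimization

**Implementation**:
- The `.env` file stores sensitive values such as API keys
- `configs/config.yaml` stores the structured configuration
- Environment variables are substituted via `${VAR}` placeholders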
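
The `${VAR}` substitution can be sketched with `os.Expand` from the standard library. This is an assumption about the mechanism, not the project's loader: `expandConfig` and the variable name `SILICONFLOW_API_KEY` are hypothetical. Note that `os.Expand` also expands the bare `$VAR` form and replaces undefined variables with whatever the lookup function returns (an empty string for `os.Getenv`).

```go
package main

import (
	"fmt"
	"os"
)

// expandConfig replaces ${VAR} (and $VAR) placeholders in raw config text
// using the supplied lookup function.
func expandConfig(raw string, lookup func(string) string) string {
	return os.Expand(raw, lookup)
}

func main() {
	// SILICONFLOW_API_KEY is a hypothetical variable name used for illustration.
	raw := "api_key: ${SILICONFLOW_API_KEY}\napi_host: https://api.siliconflow.cn/v1\n"
	fmt.Print(expandConfig(raw, os.Getenv))
}
```

Passing the lookup function as a parameter keeps the expansion testable without touching the real process environment.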
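
### Choice of configuration file format

**Decision**: Use YAML.

**Reasons**:
1. Easy for humans to read
2. Supports complex data structures
3. Well supported in the Go ecosystem

**Notes**:
- Use the `gopkg.in/yaml.v3` library
- Pay attention to indentation and formatting
- Provide configuration validation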
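
Configuration validation, the last note above, might look like the following after the YAML has been decoded. The `appConfig` and `providerConfig` types are simplified stand-ins invented for this sketch, not the project's `config.Config`, and the specific checks are assumptions. `errors.Join` requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

// providerConfig and appConfig are simplified, hypothetical stand-ins
// for the structures decoded from configs/config.yaml.
type providerConfig struct {
	APIKey  string
	APIHost string
	Model   string
}

type appConfig struct {
	DefaultProvider string
	Timeout         int
	Providers       map[string]providerConfig
}

// validate collects every problem it finds instead of stopping at the first,
// so the user can fix the file in one pass. It returns nil when cfg is valid.
func validate(cfg appConfig) error {
	var errs []error
	if cfg.Timeout <= 0 {
		errs = append(errs, errors.New("timeout must be positive"))
	}
	p, ok := cfg.Providers[cfg.DefaultProvider]
	if !ok {
		errs = append(errs, fmt.Errorf("default provider %q is not configured", cfg.DefaultProvider))
	} else if p.APIKey == "" {
		errs = append(errs, fmt.Errorf("provider %q has no API key", cfg.DefaultProvider))
	}
	return errors.Join(errs...)
}

func main() {
	cfg := appConfig{DefaultProvider: "siliconflow", Timeout: 0}
	if err := validate(cfg); err != nil {
		fmt.Println("invalid configuration:")
		fmt.Println(err)
	}
}
```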


@@ -149,3 +149,79 @@
**Related documents**:
- [memory.md#环境变量加载问题](memory.md#环境变量加载问题)
- [changelog.md#0.0.3](changelog.md#003)
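
---

### [2026-03-29 10:00] Version 0.2.0 - Language code parsing design

**Reason**: Users need to specify the target language with the `--lang` parameter, and several language code formats must be accepted.

**Analysis**:
- Standard BCP 47 tags must be supported (e.g. `zh-CN`, `en-US`)
- Short aliases must be supported (e.g. `cn`, `en`)
- Language names written out in full must be supported (e.g. `chinese`, `english`)
- Parsing must be smart and give helpful error hints

**Solution**:
1. Create the `internal/lang/lang.go` module
2. Implement the language code mapping table and the parsing function
3. Support case-insensitive and fuzzy matching
4. Provide language-name lookup and suggestions

**Technical details**:
- Store the language code mapping in a `map[string]string`
- Implement `ParseLanguageCode()` for smart parsing
- Support 30+ languages and variants
- Add comprehensive unit tests

**Related documents**: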
- [AGENTS.md#语言代码处理](AGENTS.md#语言代码处理)
- [changelog.md#0.2.0](changelog.md#020)
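
---

### [2026-03-29 10:30] Version 0.2.0 - Onboard configuration wizard

**Reason**: Users need a friendly configuration interface, especially on first use.

**Analysis**:
- An interactive configuration wizard is needed
- It must support choosing a provider, entering the API key, and setting defaults
- It must generate a standard YAML configuration file
- It must support forced reconfiguration

**Solution**:
1. Use the `github.com/AlecAivazis/survey/v2` library
2. Implement a step-by-step flow: choose provider → configure provider → global settings → save
3. Provide friendly error handling and user prompts
4. Support the `--force` parameter for forced reconfiguration

**Technical details**:
- Use the `survey.Select`, `survey.Input`, and `survey.Confirm` components
- Implement per-provider defaults with custom overrides
- Generate a complete configuration file containing all required fields
- Check whether the configuration file already exists

**Related documents**: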
- [AGENTS.md#Onboard配置向导](AGENTS.md#onboard配置向导)
- [changelog.md#0.2.0](changelog.md#020)
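
---

### [2026-03-29 11:00] Version 0.2.0 - Phased migration strategy

**Reason**: Development convenience must be balanced against the requirements of the final release.

**Analysis**:
- The development phase needs a simple configuration method (`.env` + `configs/config.yaml`)
- Before release, configuration must move to the user configuration directory (`~/.config/yoo/yoo.yml`)
- A smooth migration path and backward compatibility are required

**Solution**:
1. **Phase 1 (current)**: Keep using `.env` + `configs/config.yaml`
2. **Phase 2 (before release)**: Implement configuration file path lookup and a migration tool
3. **Phase 3 (final)**: Remove the `.env` dependency and rely on the configuration file only

**Technical details**:
- Configuration file path priority: command line > environment variable > user directory > current directory
- Stay backward compatible and keep supporting the old configuration format
- Provide configuration validation and error hints
- Implement a configuration migration tool (planned)

**Related documents**: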
- [AGENTS.md#分阶段迁移策略](AGENTS.md#分阶段迁移策略)
- [changelog.md#0.2.0](changelog.md#020)