Ruby Chinese Encoding - 程序员自由职业

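Handling string encoding correctly matters in Ruby, especially when you work with Chinese text. Ruby 1.9 introduced per-string encodings, and Ruby 2.0 made UTF-8 the default source encoding. The basic concepts and common operations are outlined below: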

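String encoding:

Every Ruby string carries its own encoding, and different strings in the same program can use different ones. Use the encoding method to check a string's encoding: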

str = "你好"
puts str.encoding
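
As a small sketch of what the encoding label means in practice, the following compares the character count and the byte count of the same string and checks that its bytes are valid. It assumes the source file is saved as UTF-8 (the default source encoding since Ruby 2.0); the variable names are illustrative only.

```ruby
greeting = "你好"

# Each of these two Chinese characters takes 3 bytes in UTF-8.
char_count = greeting.length      # counts characters
byte_count = greeting.bytesize    # counts raw bytes

puts greeting.encoding            # the encoding label attached to the string
puts "#{char_count} characters, #{byte_count} bytes"
puts greeting.valid_encoding?     # true when the bytes are well-formed for that encoding
```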

Converting encodings:

Use the encode method to convert a string to another encoding:

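gbk_str = str.encode("GBK")         # transcode the UTF-8 bytes to GBK
puts gbk_str.encoding               # GBK
utf8_str = gbk_str.encode("UTF-8")  # and back to UTF-8
puts utf8_str.encoding              # UTF-8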
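
Conversion can fail when the target encoding has no equivalent for a character. The sketch below shows one way to handle that, using the undef: and replace: options of encode. The sample text and variable names are assumptions made for illustration.

```ruby
text = "你好 😀"   # the emoji has no GBK equivalent

# Without options, an unrepresentable character raises an error.
begin
  text.encode("GBK")
  failed = false
rescue Encoding::UndefinedConversionError
  failed = true
end
puts failed

# undef: :replace substitutes a placeholder instead of raising.
safe = text.encode("GBK", undef: :replace, replace: "?")
round_trip = safe.encode("UTF-8")
puts round_trip
```

Whether silent replacement is acceptable depends on the data; for anything that must survive a round trip, letting the error surface is usually the safer choice.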

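String concatenation:

Concatenation raises Encoding::CompatibilityError when the two strings have incompatible encodings and both contain non-ASCII characters. Two UTF-8 literals concatenate without any problem. Note that force_encoding only relabels the bytes without converting them, so use encode to bring both strings into a common encoding first:

str1 = "你好"
str2 = "こんにちは".encode("Shift_JIS")  # same text, different encoding

# Wrong
# combined_str = str1 + str2  # raises Encoding::CompatibilityError

# Right: transcode to a common encoding before concatenating
combined_str = str1 + str2.encode("UTF-8")
puts combined_str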
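
The difference between relabeling and transcoding is easy to miss, so here is a minimal sketch contrasting force_encoding with encode on the same GBK bytes. It assumes a UTF-8 source file; the variable names are illustrative.

```ruby
utf8 = "你好"
gbk  = utf8.encode("GBK")            # transcodes: the bytes change

# force_encoding keeps the bytes and only swaps the label.
relabeled = gbk.dup.force_encoding("UTF-8")
puts relabeled.valid_encoding?       # these GBK bytes are not well-formed UTF-8

# encode reads the bytes as GBK and rewrites them as UTF-8.
converted = gbk.encode("UTF-8")
puts converted == utf8
```

force_encoding is the right tool only when the label is wrong and the bytes are already in the target encoding, for example after reading raw binary data known to be UTF-8.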

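Regular expressions with Chinese text:

A regular expression and the string it is matched against must have compatible encodings. A regexp literal that contains Chinese characters takes the encoding of the source file, and matching it against a string in an incompatible encoding (GBK, for example) raises Encoding::CompatibilityError: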

pattern = /你好/
matched = str.match(pattern)
puts matched[0] if matched
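
Beyond literal matches, Ruby's regexp engine supports Unicode script properties. The following sketch, with made-up sample text, uses \p{Han} to pull out runs of Chinese characters and shows the mismatch error for a GBK string. It assumes a UTF-8 source file.

```ruby
sentence = "Ruby 支持 Unicode 属性"

# \p{Han} matches one Han (Chinese) character; + groups consecutive ones.
han_runs = sentence.scan(/\p{Han}+/)
puts han_runs.inspect

# A regexp literal with non-ASCII characters takes the source file's encoding.
puts(/你好/.encoding)

# Matching a UTF-8 regexp against a GBK string raises an error.
begin
  /你好/.match?("你好".encode("GBK"))
  mismatch_raised = false
rescue Encoding::CompatibilityError
  mismatch_raised = true
end
puts mismatch_raised
```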

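File encoding:

When working with files, state the encoding explicitly in the mode string, and read a file with the same encoding it was written in: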

# Read a file
File.open("filename.txt", "r:UTF-8") do |file|
  content = file.read
  puts content
end

# Write a file
File.open("output.txt", "w:UTF-8") do |file|
  file.puts "你好"
end
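
When a file is not UTF-8, the mode string can also name an internal encoding so that Ruby transcodes on read. The sketch below writes a GBK file into a temporary directory and reads it back as UTF-8. The file name is hypothetical, and the temporary directory is used only to keep the example self-contained.

```ruby
require "tmpdir"

content, size_on_disk = Dir.mktmpdir do |dir|
  path = File.join(dir, "gbk_sample.txt")   # hypothetical file name

  # "w:GBK" sets the external encoding: UTF-8 strings are transcoded to GBK on write.
  File.open(path, "w:GBK") { |file| file.write("你好") }

  # "r:GBK:UTF-8" reads GBK bytes and hands back UTF-8 strings.
  text = File.open(path, "r:GBK:UTF-8") { |file| file.read }

  [text, File.size(path)]
end

puts content
puts content.encoding
puts size_on_disk   # size of the file in GBK bytes
```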

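Chinese comments:

Comments are part of the source file, so they must use the same encoding as the rest of it. Since Ruby 2.0 the default source encoding is UTF-8, so a file saved as UTF-8 can contain Chinese comments without further setup. Ruby 1.9 needs a # encoding: utf-8 magic comment on the first line.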

# 这是一个中文注释
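
To check which encoding Ruby assumes for a source file, the __ENCODING__ keyword can be printed. This is a minimal sketch; the magic comment is redundant on Ruby 2.0 and later and is shown only for illustration, and the variable name is made up.

```ruby
# encoding: utf-8
# The magic comment above only takes effect on the first line of a file
# (or the second line, after a shebang).

label = "中文标签"      # a Chinese string literal
puts __ENCODING__       # encoding of this source file
puts label.encoding     # string literals inherit the source encoding
```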

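When working with Chinese text, make sure every string, file, and source file in your code uses the encoding you expect, so that mismatched encodings do not cause errors. Ruby's encoding support is flexible, but multilingual text still needs care.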

Please credit the source when reposting: http://www.zyzy.cn/article/detail/6440/Ruby