前言

最近新项目重新评估了一下protobuf的C/C++ -> Lua binding 方案。之前,使用最广泛的 Lua binding 方案应该是 云风pbc 。但是这个库已经是作者弃坑好多年的状态了。我之前使用 pbc 的时候刚碰上 protobuf 3.0 刚出来,当时打了patch来适配 protobuf 3.0 ,还修复了一些其他问题。这个Patch有些推给了上游,有些因为和上游的某些机制冲突没有推。我了解到的很多其他项目也或多或少的打了自己的Patch,大多数也没往上游推。基本上 pbc 已经处于一个失维的状态,所以这次新项目就干脆来寻求更好,或者说仍然有良好活跃度的解决方案。于是就看向了 upb

upb 是一个 protobuf 官方的子项目,运行时使用纯C编写。代码生成工具使用C++,依赖 libprotoc 和 abseil-cpp 。并且 upb 是现在protobuf的Ruby, PHP和Python的官方运行时库,里面也包含了Lua binding模块。这样不仅技术支持有保障(有主流项目踩坑),并且能跟进protobuf的新功能。 在Lua binding方面,我们还有一些特殊的地方。因为编译的时候需要包含lua的头文件和设置链接库,而我们需要支持各种不同的lua运行时。(比如我们客户端用Unlua并集成到Unreal Engine,服务器用官方的lua 5.4,我们这边还有项目组定制化了Lua抹除了浮点数)。所以对于Lua binding,我们不能用预编译包的方式发布,只能用源码发布。所幸涉及的文件不多,还比较好处理。

目前看起来 upb 的工程管理还不是很规范,bazel构建系统可能用起来还比较简单,我们项目组主要使用cmake,就需要适配一下。实际上 upb 提供了工具通过bazel来生成cmake的工程文件,只是生成的文件有一些问题,那么本文主要记录一下适配过程中踩的一些坑和我们的解决方案。

Lua 5.3+ 版本适配

第一个问题是最简单最容易解决的,upb 的Lua binding代码虽然溜了一些适配代码。不过我估计没人测试过。有一些头文件和函数名使用时有问题的。这类版本问题都比较好处理,适配一下就行了。 这个Lua版本兼容的问题已经提 issue 到 https://github.com/protocolbuffers/upb/issues/734 。解决方案的PR也已经推到 https://github.com/protocolbuffers/upb/pull/735 。等啥时候和入了应该就好了。

一处C库的未定义行为调用

在 Lua binding 的代码中( upb/bindings/lua/upbc.cc ) 有一个地方是这么写的。

static void PrintString(int max_cols, absl::string_view* str,
                        protobuf::io::Printer* printer) {
  printer->Print("\'");
  while (max_cols > 0 && !str->empty()) {
    char ch = (*str)[0];
    if (ch == '\\') {
      printer->PrintRaw("\\\\");
      max_cols--;
    } else if (ch == '\'') {
      printer->PrintRaw("\\'");
      max_cols--;
    } else if (isprint(ch)) {
      printer->WriteRaw(&ch, 1);
      max_cols--;
    } else {
      unsigned char byte = ch;
      printer->PrintRaw("\\x");
      PrintHexDigit(byte >> 4, printer);
      PrintHexDigit(byte & 15, printer);
      max_cols -= 4;
    }
    str->remove_prefix(1);
  }
  printer->Print("\'");
}

这里 isprint(ch)ch 是一个 char 。而按照 https://en.cppreference.com/w/c/string/byte/isprint 的说法。这个函数接受的参数虽然是个 int 但它必须是个 unsigned char 或者 EOF 。否则是UB。实际上,这里对于utf-8字符串或者unicode,这里取到的 ch 可能为负数,于是隐式转换成 int 后不是一个 unsigned char 。在MSVC下此处会崩溃。最简单的方法是把 isprint(ch) 改成 isprint(static_cast<int>(static_cast<unsigned char>(ch)))

cmake生成工具的问题

第二个问题是新版本的 upb 中移除了 CMakeLists.txt 文件。按我和社区讨论的说法是本来应该仓库里提供这个文件的,不知为何最新版本消失了。其实 upb 是提供了工具,通过Bazel工程生成 CMakeLists.txt 文件的,但是生成的 CMakeLists.txt 文件有两个问题。第一个问题是新增的依赖库 utf8_range 中,链接的target的名字带了 / ,因为它在 Bazel中的target名字是 //third_party/utf8_range:utf8_range 。第二个问题是它没有生成 export 配置。导致我们无法使用cmake自带的 find_package 功能自动设置依赖包含目录和自动按需链接库。而 upb 内部模块还比较多,管理起来还比较麻烦。

对于第二个问题, vcpkg 的做法是打了个巨大的patch。 https://github.com/microsoft/vcpkg/blob/master/ports/upb/0001-make-cmakelists-py.patch 。而我也一样打了个更大的Patch,解决了这两个问题,和前面提到的Lua 5.3+ 版本适配和未定义行为调用的问题。

grpc 支持

非常悲伤的是 grpc 也依赖 upb ,并且它是以源码subtree的方式引入的,grpc 自己写了编译脚本,仅仅引入了运行时。显然如果我们想同时使用 grpcupb 的Lua binding。我们就必须编译出 upbprotoc-gen-lua 插件,并且使用同一版本。这导致我不得不写两份Patch。一份在 grpc 里,一份在 upb 里。

这两份Patch都写在之前我写的构建系统 cmake-toolset 里了, 地址在 https://github.com/atframework/cmake-toolset/tree/main/ports/grpc ,有需要的小伙伴可以自取。

Lua binding和测试小工具

也是为了方便测试,我在 cmake-toolset 构建系统的Test里写了个小工具,可以加载 upb 的Lua binding然后直接命令行做测试。 只要clone cmake-toolset 之后,使用 bash ci/do_ci.sh gcc.standalone-upb.testpwsh -File ci/do_ci.ps1 msvc.standalone-upb.test 就可以生成。 然后通过 cmake-toolset-upb-and-lua-binding-test [Lua查找目录或要执行的Lua文件...] 来执行指定Lua脚本。

xresloader 读表工具集成

使用上面的测试小工具,我顺带做了对我们项目组使用的转表工具 xresloader 所生成的二进制文件的读表支持。 还是通过 xres-code-generator 来自动生成加载代码,和C++接口与纯Lua接口一样,支持多版本管理、多重索引( Key-Value 型和 Key-List 型 )和lazy load。

比如在以下proto中:

syntax = "proto3";

import "xresloader.proto";
import "xresloader_ue.proto";

import "xrescode_extensions_v3.proto";

message role_upgrade_cfg {
    option (org.xresloader.ue.helper)       = "helper";
    option (org.xresloader.msg_description) = "Test role_upgrade_cfg with multi keys";

    option (xrescode.loader) = {
        file_path : "role_upgrade_cfg.bytes"
        indexes : {
            // name: "id"
            fields : "Id"
            index_type : EN_INDEX_KL // Key - List index: (Id) => list<role_upgrade_cfg>
        }
        indexes : {
            // name: "id_level" // 默认索引名字为所有的field名转小写用下划线连接
            fields : "Id"
            fields : "Level"
            index_type : EN_INDEX_KV // Key - Value index: (Id, Level) => role_upgrade_cfg
        }
        tags : "client"
        tags : "server"
    };

    uint32 Id        = 1 [ (org.xresloader.ue.key_tag) = 1000 ];
    uint32 Level     = 2 [ (org.xresloader.ue.key_tag) = 1 ];
    uint32 CostType  = 3 [ (org.xresloader.verifier) = "cost_type", (org.xresloader.field_description) = "Refer to cost_type" ];
    int64  CostValue = 4;
    int32  ScoreAdd  = 5;
}

生成的代码使用方式如下:

-- We will use require(...) to load
--   - DataTableServiceUpb
--   - DataTableCustomIndexUpb
--   - xrescode_extensions_v3_pb
--   - pb_header_v3_pb
--   - upb
--   - google/protobuf/descriptor_pb
--   - Other custom proto files generated by protoc-gen-lua
-- Please ensure these can be load by require(FILE_PATH)
local excel_config_service = require("DataTableServiceUpb")
local upb = require("upb")

-- Set logger
-- excel_config_service:OnError = function (message, data_set, indexName, keys...) end
-- excel_config_service:OnInfo = function (message, data_set, indexName) end

excel_config_service:ReloadTables()

-- 获取索引的message名字
local role_upgrade_cfg = excel_config_service:Get("role_upgrade_cfg")
print("======================= Lazy load begin =======================")
-- 按(索引名, key...) 获取Excel中一行
local data = role_upgrade_cfg:GetByIndex("id_level", 10001, 3) -- using the Key-Value index: id_level
print("======================= Lazy load end =======================")

print("----------------------- Get by Key-Value index -----------------------")
-- upb自带序列化成json的功能
print(string.format("Data of role_upgrade_cfg: id=10001, level=3 -> json_encode: %s",
    upb.json_encode(data, { upb.JSONENC_PROTONAMES })))

print("----------------------- Get by reflection and Key-List index -----------------------")
-- 获取当前配置分组,也就是当前版本。当然也可以获取其他版本
local current_group = excel_config_service:GetCurrentGroup()
-- 获取message名字和索引的名字获取
local role_upgrade_cfg2 = excel_config_service:GetByGroup(current_group, "role_upgrade_cfg")
local data2 = role_upgrade_cfg2:GetByIndex("id", 10001) -- using the Key-List index: id
for _, v1 in ipairs(data2) do
    print(string.format("\tid: %s, level: %s", tostring(v1.Id), tostring(v1.Level)))
    -- 此处是反射接口测试
    for fds in role_upgrade_cfg2:GetMessageDescriptor():fields() do
        print(string.format("\t\t%s=%s", fds:name(), tostring(v1[fds:name()])))
    end
end

记录一个Python的crash

在我写一个Python工具的时候碰到一个和 upb 有关的Crash BUG。其实应该是 protobuf的 python实现的BUG,不是 upb 的BUG,也记录在这里吧。

Crash位置:

#0  0x00007fd147b64b20 in upb_Array_Size () from /home/owent/workspace/tencent/pix/server/third_party/install/linux-x86_64-gnu/.modules/lib/python3.9/site-packages/google/_upb/_message.abi3.so
#1  0x00007fd147b51a79 in PyUpb_ExtensionDict_Contains () from /home/owent/workspace/tencent/pix/server/third_party/install/linux-x86_64-gnu/.modules/lib/python3.9/site-packages/google/_upb/_message.abi3.so
#2  0x00007fd14f3621d0 in _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0x7fd14753fa40, throwflag=<optimized out>) at Python/ceval.c:3036
#3  0x00007fd14f30df02 in _PyEval_EvalFrame (throwflag=0, f=0x7fd14753fa40, tstate=0x89e150) at ./Include/internal/pycore_ceval.h:40
#4  function_code_fastcall (tstate=0x89e150, co=<optimized out>, args=<optimized out>, nargs=4, globals=<optimized out>) at Objects/call.c:330
#5  0x00007fd14f3617d6 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0xca1e68, callable=0x7fd1473a8700, tstate=0x89e150) at ./Include/cpython/abstract.h:118
#6  PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0xca1e68, callable=0x7fd1473a8700) at ./Include/cpython/abstract.h:127
#7  call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x89e150) at Python/ceval.c:5077
#8  _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0xca1c80, throwflag=<optimized out>) at Python/ceval.c:3506
#9  0x00007fd14f3606cf in _PyEval_EvalFrame (throwflag=0, f=0xca1c80, tstate=0x89e150) at ./Include/internal/pycore_ceval.h:40
#10 _PyEval_EvalCode (tstate=tstate@entry=0x89e150, _co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=<optimized out>, kwnames=0x7fd147c1a058, kwargs=0x7fd147d564a8, kwcount=7, kwstep=1, 
    defs=0x7fd147b88e38, defcount=7, kwdefs=0x0, closure=0x0, name=0x7fd14f5ca530, qualname=0x7fd147c27e40) at Python/ceval.c:4329
#11 0x00007fd14f30dd69 in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:396
#12 0x00007fd14f30d27d in _PyObject_FastCallDictTstate (tstate=0x89e150, callable=0x7fd1473a88b0, args=<optimized out>, nargsf=<optimized out>, kwargs=<optimized out>) at Objects/call.c:129
#13 0x00007fd14f30e68b in _PyObject_Call_Prepend (tstate=tstate@entry=0x89e150, callable=callable@entry=0x7fd1473a88b0, obj=obj@entry=0x7fd147d0a4c0, args=args@entry=0x7fd147da0b20, kwargs=kwargs@entry=0x7fd147cdbac0) at Objects/call.c:489
#14 0x00007fd14f33f839 in slot_tp_init (self=0x7fd147d0a4c0, args=0x7fd147da0b20, kwds=0x7fd147cdbac0) at Objects/typeobject.c:6969
#15 0x00007fd14f33cd4b in type_call (type=<optimized out>, args=0x7fd147da0b20, kwds=0x7fd147cdbac0) at Objects/typeobject.c:1026
#16 0x00007fd14f30d553 in _PyObject_MakeTpCall (tstate=0x89e150, callable=0x91ad90, args=0x95ab48, nargs=<optimized out>, keywords=<optimized out>) at Objects/call.c:191
#17 0x00007fd14f365e20 in _PyObject_VectorcallTstate (kwnames=0x7fd147e45820, nargsf=<optimized out>, args=0x95ab48, callable=0x91ad90, tstate=<optimized out>) at ./Include/cpython/abstract.h:116
#18 _PyObject_VectorcallTstate (kwnames=0x7fd147e45820, nargsf=<optimized out>, args=0x95ab48, callable=0x91ad90, tstate=<optimized out>) at ./Include/cpython/abstract.h:103
#19 PyObject_Vectorcall (kwnames=0x7fd147e45820, nargsf=<optimized out>, args=0x95ab48, callable=0x91ad90) at ./Include/cpython/abstract.h:127
#20 call_function (kwnames=0x7fd147e45820, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=<optimized out>) at Python/ceval.c:5077
#21 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0x95a910, throwflag=<optimized out>) at Python/ceval.c:3537
#22 0x00007fd14f3606cf in _PyEval_EvalFrame (throwflag=0, f=0x95a910, tstate=0x89e150) at ./Include/internal/pycore_ceval.h:40
#23 _PyEval_EvalCode (tstate=tstate@entry=0x89e150, _co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=<optimized out>, kwnames=0x0, kwargs=0x8f3190, kwcount=0, kwstep=1, defs=0x0, 
    defcount=0, kwdefs=0x0, closure=0x0, name=0x7fd147f565f0, qualname=0x7fd147f565f0) at Python/ceval.c:4329
#24 0x00007fd14f30dd69 in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:396
#25 0x00007fd14f361550 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=9223372036854775808, args=0x8f3190, callable=0x7fd147ccfca0, tstate=0x89e150) at ./Include/cpython/abstract.h:118
#26 PyObject_Vectorcall (kwnames=0x0, nargsf=9223372036854775808, args=0x8f3190, callable=0x7fd147ccfca0) at ./Include/cpython/abstract.h:127
#27 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x89e150) at Python/ceval.c:5077
#28 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0x8f3020, throwflag=<optimized out>) at Python/ceval.c:3520
#29 0x00007fd14f3606cf in _PyEval_EvalFrame (throwflag=0, f=0x8f3020, tstate=0x89e150) at ./Include/internal/pycore_ceval.h:40
#30 _PyEval_EvalCode (tstate=0x89e150, _co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=0x0, kwargs=0x0, kwcount=0, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, 
    closure=0x0, name=0x0, qualname=0x0) at Python/ceval.c:4329
#31 0x00007fd14f360382 in _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x0, kwcount=0, kwstep=2, defs=0x0, 
    defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at Python/ceval.c:4361
#32 0x00007fd14f360323 in PyEval_EvalCodeEx (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kws=<optimized out>, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0)
    at Python/ceval.c:4377
#33 0x00007fd14f3d20fb in PyEval_EvalCode (co=co@entry=0x7fd147db1500, globals=globals@entry=0x7fd147f50100, locals=locals@entry=0x7fd147f50100) at Python/ceval.c:828
#34 0x00007fd14f3e3123 in run_eval_code_obj (tstate=0x89e150, co=0x7fd147db1500, globals=0x7fd147f50100, locals=0x7fd147f50100) at Python/pythonrun.c:1221
#35 0x00007fd14f3e30aa in run_mod (mod=<optimized out>, filename=<optimized out>, globals=0x7fd147f50100, locals=0x7fd147f50100, flags=<optimized out>, arena=<optimized out>) at Python/pythonrun.c:1242
#36 0x00007fd14f2b77a1 in pyrun_file (fp=fp@entry=0x8d81f0, filename=filename@entry=0x7fd147d97b70, start=start@entry=257, globals=globals@entry=0x7fd147f50100, locals=locals@entry=0x7fd147f50100, closeit=closeit@entry=1, 
    flags=0x7ffe748d3e88) at Python/pythonrun.c:1140
#37 0x00007fd14f2b754d in pyrun_simple_file (flags=0x7ffe748d3e88, closeit=1, filename=0x7fd147d97b70, fp=0x8d81f0) at Python/pythonrun.c:450
#38 PyRun_SimpleFileExFlags (fp=fp@entry=0x8d81f0, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7ffe748d3e88) at Python/pythonrun.c:483
#39 0x00007fd14f2b840d in PyRun_AnyFileExFlags (fp=fp@entry=0x8d81f0, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7ffe748d3e88) at Python/pythonrun.c:92
#40 0x00007fd14f3eac02 in pymain_run_file (cf=0x7ffe748d3e88, config=0x89ac90) at Modules/main.c:373
#41 pymain_run_python (exitcode=0x7ffe748d3e80) at Modules/main.c:598
#42 Py_RunMain () at Modules/main.c:677
#43 0x00007fd14f3ea757 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:731
#44 0x00007fd14e4fe555 in __libc_start_main () from /lib64/libc.so.6
#45 0x000000000040108e in _start ()

相关结构定义:


size_t upb_Array_Size(const upb_Array* arr) { return arr->size; }

/* Our internal representation for repeated fields.  */
struct upb_Array {
  uintptr_t data;  /* Tagged ptr: low 3 bits of ptr are lg2(elem size). */
  size_t size;     /* The number of elements in the array. */
  size_t capacity; /* Allocated storage. Measured in elements. */
  uint64_t junk;
};

System V X86_64 调用约定 中参数传入的寄存器顺序为: rdi, rsi, rdx, rcx, r8, r9

(gdb) info all-registers
rax            0x0                 0
rbx            0x7fd1473bb2f0      140536819987184
rcx            0x0                 0
rdx            0x7fd1477d77b0      140536824297392
rsi            0x9b8af0            10193648
rdi            0x0                 0
rbp            0x7fd14753fa40      0x7fd14753fa40
rsp            0x7ffe748d3288      0x7ffe748d3288
r8             0x243f6a8885a308d3  2611923443488327891
r9             0x3f                63
r10            0xfcabbfeccfd89b2a  -239887131313923286
r11            0x13198a2e03707344  1376283091369227076
r12            0x7fd147cdc06e      140536829558894
r13            0x7fd14753fbf0      140536821578736
r14            0xd03df0            13647344
r15            0xa40328            10748712
rip            0x7fd147b64b20      0x7fd147b64b20 <upb_Array_Size>

以上,PyUpb_ExtensionDict_Containsupb_Array_Size 的调用,参数传入了 nullptr

这个BUG已经提交到 https://github.com/protocolbuffers/protobuf/issues/10396 ,有碰到同样问题的小伙伴也可以关注下进展。

最后