How to use modules in C++20

徐登峰 · posted on 2023-2-10 18:37:00

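C++20 introduced modules. Using modules has the following advantages:

#include amounts to copy and paste, so compilation gets slow when a file includes many headers. Importing a module instead reuses an already compiled binary file that describes the functions, classes, templates, and so on that the module exports.
Including several headers can lead to macro pollution and sometimes depends on the order of the #include directives, which can cause unexpected errors when compiling a .cpp source file. With import, the order does not matter and macros cause no such problems, because only what a module explicitly exports is visible to the source file that imports it.
If a function inside a module has the same name as a function in the source file that imports the module, there is no compile error as long as the module does not export that function: a module is a self-contained unit.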
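To make the include-order hazard concrete, here is a minimal single-file sketch. It does not use modules at all: the two "headers" are simulated inline, and every name in it (legacy_config.h, buffer.h, BUFFER_SIZE, buffer_size) is hypothetical. It only illustrates how a macro defined by an earlier textual include silently changes what a later one means.

```cpp
// Single-file simulation of two textual includes. All names are hypothetical.

// ---- contents of an imaginary "legacy_config.h", included first ----
#define BUFFER_SIZE 16

// ---- contents of an imaginary "buffer.h", included second ----
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 64  // default that the header's author expected to apply
#endif
constexpr int buffer_size() { return BUFFER_SIZE; }

// Because #include is textual, the macro from the earlier include wins here.
// With the opposite include order, buffer_size() would be built with 64.
static_assert(buffer_size() == 16, "macro from the earlier include leaked in");
```

A named module neither picks up macros defined by the file that imports it nor hands its own macros to that file, which is why this kind of order dependence does not arise with import.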

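Some concepts

module: can be seen as a collection of module units. A module can export any C++ functions, classes, constants, and so on; another file that imports the module can then use them.
module unit: a translation unit that contains a module declaration, in other words a C++ file that belongs to a module. A module consists of one or more module units.
module interface: the interface of a module. It declares which functions and classes the module exports, much like a traditional .h header. Visual Studio uses .ixx as the file suffix for module interfaces.
module implementation: implements the functions and classes exported by the module interface, much like a traditional .cpp source file. Visual Studio keeps the traditional .cpp suffix for module implementations; conceptually no distinction is needed, and they can simply be treated as source files.
primary module interface: a module usually consists of several files, one of which defines the public interface of the module as a whole. When module partitions are used, the primary module interface has to re-export every interface partition with export import, directly or indirectly.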

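Creating your own module

File structure of a module unit

module interface file: identified by the statement export module [module-name];
The top of the file holds the global module fragment and the module preamble. The former is for traditional #define and #include directives, which is needed in particular when a header has to be configured through macros; the latter is for importing the other modules that are used. The order of these two regions matters: the global module fragment comes first.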
// XXX.ixx file

module; // optional, starts the global module fragment
// ------------- global module fragment ---------------
// #include and #define can be used here
// What is included here is visible to this interface file only, not to the module's implementation files
// ----------------------------------------------------

export module [module-name]; // required, starts the module preamble
// ------------------ module preamble ------------------
// Other modules can be imported here
// These imports are also available to the implementation files of the current module
// ------------------------------------------------------

// The current module starts here
// export functions, classes, constants, and so on

module :private; // optional, starts the private module fragment (this is not a partition)
// Only allowed when this file is the module's only module unit

// Only this file can see what follows

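module implementation file: identified by the statement module [module-name];
// XXX.cpp file

module; // optional, starts the global module fragment
// #include and #define directives go here; they are visible to this file only

module [module-name]; // required, states which module this implementation belongs to

// Other modules can be imported here, for use in this file only
// Whatever the associated module interface imports or declares can be used directly

// implementation
The main difference between a module interface and a module implementation is whether the module declaration carries the export keyword.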

A simple example

For a module with a simple structure, the implementation can go into the interface file, for example
// math.ixx file

// No global module fragment

export module math;
//
// No imports in the module preamble
//

export // The module's interface
{
  auto square(const auto& x);
  const double lambda = 1.303577269034296391257;  // Conway's constant
  enum class Oddity { Even, Odd };
  auto getOddity(int x);
}

// Implementation
// square (a template) and getOddity (deduced return type) need definitions that importers can see
auto square(const auto& x) { return x * x; }
bool isOdd(int x) { return x % 2 != 0; }  // Not exported: local to the module
auto getOddity(int x) { return isOdd(x) ? Oddity::Odd : Oddity::Even; }

Next, a slightly more complex module made up of one interface file and two implementation files. The interface file roman.ixx exports two functions, to_roman and from_roman, and each of the two .cpp files implements one of them.
// roman.ixx – Interface file for a Roman numerals module

export module roman;
import <string>;
import <string_view>;

export std::string to_roman(unsigned int i);
export unsigned int from_roman(std::string_view roman);
// to_roman.cpp – Implementation of the to_roman() function
module roman;
std::string to_roman(unsigned int i)
{
  if (i > 3999) return {}; // 3999, or MMMCMXCIX, is the largest standard Roman numeral
  static const std::string ms[] { "","M","MM","MMM" };
  static const std::string cds[]{ "","C","CC","CCC","CD","D","DC","DCC","DCCC","CM" };
  static const std::string xls[]{ "","X","XX","XXX","XL","L","LX","LXX","LXXX","XC" };
  static const std::string ivs[]{ "","I","II","III","IV","V","VI","VII","VIII","IX" };
  return ms[i / 1000] + cds[(i % 1000) / 100] + xls[(i % 100) / 10] + ivs[i % 10];
}

// from_roman.cpp – Implementation of the from_roman() function
module roman;
unsigned int char_to_roman(char c)
{
  switch (c)
  {
    case 'I': return 1; case 'V': return 5; case 'X': return 10;
    case 'L': return 50; case 'C': return 100; case 'D': return 500;
    case 'M': return 1000; default:  return 0;
  }
}
unsigned int from_roman(std::string_view roman)
{
  unsigned int result{};
  for (size_t i{}, n{ roman.length() }; i < n; ++i)
  {
    const auto j{ char_to_roman(roman[i]) }; // Integer value of the i'th roman digit
    // Look at the next digit (if there is one) to know whether to add or subtract j
    if (i + 1 == n || j >= char_to_roman(roman[i + 1])) result += j; else result -= j;
  }
  return result;
}
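
The conversion logic itself can be checked without any module setup. The following is a single-file sketch that restates the two functions as ordinary C++ and verifies the add-or-subtract rule; the `_sketch` suffixes and the `run_checks` helper are hypothetical names introduced only for this illustration, and the table-based to_roman is written with plain string literals rather than std::string arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Value of a single Roman digit; 0 for any other character.
constexpr unsigned int char_to_roman_sketch(char c)
{
  switch (c)
  {
    case 'I': return 1;    case 'V': return 5;   case 'X': return 10;
    case 'L': return 50;   case 'C': return 100; case 'D': return 500;
    case 'M': return 1000; default:  return 0;
  }
}

// A digit is subtracted when the digit after it is larger, otherwise added.
constexpr unsigned int from_roman_sketch(std::string_view roman)
{
  unsigned int result{};
  for (std::size_t i{}, n{ roman.length() }; i < n; ++i)
  {
    const unsigned int j{ char_to_roman_sketch(roman[i]) };
    if (i + 1 == n || j >= char_to_roman_sketch(roman[i + 1])) result += j;
    else result -= j;  // unsigned wrap-around is well defined and cancels out later
  }
  return result;
}

// One lookup table per decimal position, as in to_roman().
inline std::string to_roman_sketch(unsigned int i)
{
  if (i > 3999) return {};
  static const char* const ms[]  { "", "M", "MM", "MMM" };
  static const char* const cds[] { "", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM" };
  static const char* const xls[] { "", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC" };
  static const char* const ivs[] { "", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX" };
  return std::string{ ms[i / 1000] } + cds[(i % 1000) / 100] + xls[(i % 100) / 10] + ivs[i % 10];
}

// Compile-time check of the subtractive rule: M CM XC IV.
static_assert(from_roman_sketch("MCMXCIV") == 1994, "subtractive notation");

// Runtime check: every value in the supported range survives a round trip.
inline void run_checks()
{
  assert(to_roman_sketch(1994) == "MCMXCIV");
  for (unsigned int i{}; i <= 3999; ++i)
    assert(from_roman_sketch(to_roman_sketch(i)) == i);
}
```

Calling run_checks() from a test program exercises the whole range 0 to 3999. Once the logic is trusted, the same function bodies can be placed in the module implementation files shown above.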

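Splitting a module

Larger modules can be split into smaller parts in two ways: submodules and module partitions.
submodule

C++ does not actually have a concept of submodules; a submodule is merely simulated through a module name that contains a dot. For example, the module roman from the example above can be split into three (sub)modules: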
// roman.ixx – Module interface file of the roman module
export module roman;
export import roman.from; // Not: 'export import .from;' (cf. partitions later)
export import roman.to;

// roman.from.ixx – Module interface file of the roman.from module
export module roman.from;
import <string_view>;
export unsigned int from_roman(std::string_view roman);

// roman.to.ixx – Module interface file of the roman.to module
export module roman.to;
import <string>;
export std::string to_roman(unsigned int i);
roman, roman.from, and roman.to are three completely independent modules of equal standing; the dots in the names merely suggest a hierarchy. Renaming roman.from and roman.to to abc and xyz would work just as well.

"submodules" aren't submodules, they are modules with "hierarchy-suggesting names".
The language does not enforce any hierarchical naming scheme at all. It's just that adopting one makes it easier to see the relation between modules and its submodules, and dots were specifically allowed in module names to facilitate such hierarchical naming.

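module partition

When a module gets large, it can be split into smaller parts called partitions, all of which belong to the same module. A module partition can only be imported by other parts of the same module; it cannot be imported directly from outside the module.

submodule vs. module partition: the main difference is that a submodule can be imported on its own from outside the module, whereas a partition is visible only inside its own module.
Both the interface and the implementation of a module can be partitioned.
The implementation of a module can already be spread over several .cpp files, so what is gained by partitioning the implementation?
Answer: sometimes several .cpp files of a module need to share data that should be visible only inside the module, without declaring that data in the module interface file shared by all module units.

An interface partition and an implementation partition are structured like a module's interface and implementation, except that the declaration uses module-name:partition-name instead of module-name.
The file structure of an interface partition is as follows: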
module; // optional, starts the global module fragment
// ------------- global module fragment ---------------
// #include and #define can be used here
// What is included here is visible to this interface file only, not to the module's implementation files
// ------------------------------------------------------

export module [Module-name]:[Partition-name]; // required, starts the module preamble
// ------------------ module preamble ------------------
// Other modules can be imported here
//
// Other partitions of this module can be imported too
// Only the partition name is needed, not the module name
// For example import :[partition2-name];
// ------------------------------------------------------

// export functions, classes, constants, and so on

// Note: module :private; cannot be used in a partition. A private module fragment
// may appear only in a primary module interface that is the module's only unit.
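The structure of an implementation partition is as follows
// XXX.cpp file

module; // optional, starts the global module fragment
// #include and #define directives go here; they are visible to this file only

module [Module-name]:[Partition-name]; // required, names the module and the partition defined by this file

// Other modules and other partitions (import :[partition2-name];) can be imported here
// Unlike an ordinary implementation file, a partition does not implicitly import the primary module interface

// implementation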
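Taking the roman module above as an example, it is split as follows: roman:to and roman:from are two interface partitions, where roman:to already contains the function definition while roman:from only declares its function, with the definition given in from_roman.cpp. roman:internals is an implementation partition that is imported by from_roman.cpp.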
// ---------------------------------------------------------------------------------
// roman.ixx – Primary module interface file for the roman module
export module roman;
export import :to;    // Not: 'export import roman:to;'
export import :from;  // Not: 'export import roman:from;'
// export import :internals;  /* Error: only interface partitions can be exported */

// ---------------------------------------------------------------------------------
// roman-to.ixx – Module interface file for the 'to' partition
export module roman:to;
import <string>;
export std::string to_roman(unsigned int i)
{
  // Same function body as before...
}

// ---------------------------------------------------------------------------------
// roman-from.ixx – Module interface file for the 'from' partition
export module roman:from;
import <string_view>;
export unsigned int from_roman(std::string_view roman);

// ---------------------------------------------------------------------------------
// roman-internals.cpp – Splits off an internals partition
module roman:internals;
unsigned int char_to_roman(char c)
{
// Same switch statement as before...
}

// ---------------------------------------------------------------------------------
// from_roman.cpp – Implements the from_roman() function
module roman;
import :internals; // Careful! Not 'import roman:internals;'!
unsigned int from_roman(std::string_view roman)
{
  // Same as before... (uses char_to_roman(char) function from the :internals partition)
}
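Points to note when using partitions:

Each partition name corresponds to exactly one partition file.

Following on from the previous point, even an interface partition and the implementation partition that logically goes with it cannot use the same partition name in their module declarations! In the example above, declaring module roman:from in from_roman.cpp would be illegal.

A partition is imported as import :internals, not import roman:internals! Note that there is no module name in front, because a partition of a module can only ever be imported inside that module.

A module implementation partition cannot be exported. Only module interface partitions can be exported from the primary module interface, and they must be!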

鑨 · posted on 2025-6-20 22:03:30

Hoping to grab the first reply.
