Inside node: How node is able to require binary modules
When talking about node modules, we mostly refer to “vanilla” JS modules. They are written in plain JavaScript, we can easily access their sources and they are easy to distribute. In summary: they are great! But on some occasions, we hit the boundaries of what’s doable with JavaScript in terms of performance, connectivity or platform use. One way to cope with these limitations is native node addons.
Addons are dynamically-linked shared objects written in C++. The require() function can load Addons as ordinary Node.js modules. Addons provide an interface between JavaScript and C/C++ libraries.
But in contrast to plain JS modules, native addons are compiled binaries. So how is it possible to seamlessly require a binary module?
Over the years, several ways of writing native addons have become established, with node’s N-API being the latest one. As cited above, native addons are (mostly) written in either C or C++, which opens up a set of additional possibilities. We’re able to re-use existing high-performance C or C++ libraries for increased performance or wrap a specific low-level driver while keeping the expressiveness of our language of choice, JavaScript! This combination, the best of both worlds, sounds very promising. By building a native node addon we just have to do a
const native_module = require("/my/module.node");
and we have native performance at our fingertips while writing JavaScript code.
const solution = require(“./investigation.node”)
The first component involved in loading our native addon is the require() function, which is provided through the CommonJS module loader. We won’t go into all the details of module loading here. What we’re most interested in at the moment is the fact that require() will call Module.load(), providing the path to a *.node native addon.
Depending on the file extension, Module.load() will hand off the actual loading process to one of the available extensions. The *.node extension in lib/internal/modules/cjs/loader.js looks like this:
// Native extension for .node
Module._extensions['.node'] = function(module, filename) {
  if (manifest) {
    const content = fs.readFileSync(filename);
    const moduleURL = pathToFileURL(filename);
    manifest.assertIntegrity(moduleURL, content);
  }
  // Be aware this doesn't use `content`
  return process.dlopen(module, path.toNamespacedPath(filename));
};
process.dlopen sounds a lot like dlopen(3) from the Linux man pages, so I guess we’re onto something! process.dlopen is provided through node’s internalBinding mechanism; the implementation behind it is located in src/node_binding.cc.
The heart of this method is a call to env->TryLoadAddon, which receives a callback to perform the actual loading process.
env->TryLoadAddon(*filename, flags, [&](DLib* dlib) {
…
});
Before we go any further from this point, let’s also take a look at a small sample addon to use for our experiments.
N-API module - Sample application
Instead of building a dedicated N-API sample for this post I’ll refer to a sample a friend of mine built for an introductory talk to N-API development at the MNUG - Munich NodeJS User Group.
This sample provides a native implementation of a square() function:
module.c:
#include <node_api.h>

napi_value square(napi_env env, napi_callback_info info) {
  napi_value argv[1];
  size_t argc = 1;
  napi_get_cb_info(env, info, &argc, argv, NULL, NULL);
  double value;
  napi_get_value_double(env, argv[0], &value);
  napi_value result;
  napi_create_double(env, value * value, &result);
  return result;
}

napi_value init(napi_env env, napi_value exports) {
  napi_value square_fn;
  napi_create_function(env, NULL, 0, square, NULL, &square_fn);
  napi_set_named_property(env, exports, "square", square_fn);
  return exports;
}

NAPI_MODULE(square, init)
index.js:
//const {square} = require('bindings')('square');
const {square} = require('./build/Debug/square.node');
console.log(square(4));
As we can see, we just require the compiled *.node file and are able to call our native square function.
Inside module.c, the following things happen:
- napi_get_cb_info(env, info, &argc, argv, NULL, NULL); stores the list of arguments passed to our square function in an array:
napi_value argv[1];
- Next, we store the first element of this list as a double value:
double value;
napi_get_value_double(env, argv[0], &value);
- The result of square will be stored in an napi_value and returned:
napi_value result;
napi_create_double(env, value * value, &result);
return result;
Dynamic Loading
Node addons are just dynamic shared libraries, so loading them comes down to the four major tasks of handling dynamic libraries:
- Opening a library
- Handling possible errors
- Retrieving addresses of symbols
- Closing an opened library
On POSIX systems, these tasks are handled via dlopen, dlerror, dlsym and dlclose. Within node, the class DLib in src/node_binding.h encapsulates this functionality, and if we take a look at its methods, we see that DLib::Open, DLib::Close and DLib::GetSymbolAddress use these functions.
bool DLib::Open() {
  handle_ = dlopen(filename_.c_str(), flags_);
  if (handle_ != nullptr) return true;
  errmsg_ = dlerror();
  return false;
}

void DLib::Close() {
  if (handle_ == nullptr) return;
  if (libc_may_be_musl()) {
    return;
  }
  int err = dlclose(handle_);
  if (err == 0) {
    if (has_entry_in_global_handle_map_)
      global_handle_map.erase(handle_);
  }
  handle_ = nullptr;
}

void* DLib::GetSymbolAddress(const char* name) {
  return dlsym(handle_, name);
}
On non-POSIX systems, wrappers provided by libuv (uv_dlopen etc.) are used instead, but the functionality stays the same.
Connecting the strings
Being able to open a library, retrieve symbol addresses and close it again are the first steps towards native module loading. However, there are still some things to resolve before we’re able to use our module, which is done in the callback function provided to env->TryLoadAddon:
[&](DLib* dlib) {
  // Skipped
  const bool is_opened = dlib->Open();
  node_module* mp = thread_local_modpending;
  thread_local_modpending = nullptr;
  if (!is_opened) {
    // Error handling, closing the lib
    // Skipped
  }
  if (mp != nullptr) {
    if (mp->nm_context_register_func == nullptr) {
      if (env->options()->force_context_aware) {
        dlib->Close();
        THROW_ERR_NON_CONTEXT_AWARE_DISABLED(env);
        return false;
      }
    }
    mp->nm_dso_handle = dlib->handle_;
    dlib->SaveInGlobalHandleMap(mp);
  } else {
    // The module did not register itself while being opened
    // Skipped
  }
  // -1 is used for N-API modules
  if ((mp->nm_version != -1) && (mp->nm_version != NODE_MODULE_VERSION)) {
    // Not an N-API module, and it was built against a different
    // NODE_MODULE_VERSION: version mismatch handling
    // Skipped
  }
  CHECK_EQ(mp->nm_flags & NM_F_BUILTIN, 0);
  // Do not keep the lock while running userland addon loading code.
  Mutex::ScopedUnlock unlock(lock);
  if (mp->nm_context_register_func != nullptr) {
    mp->nm_context_register_func(exports, module, context, mp->nm_priv);
  } else if (mp->nm_register_func != nullptr) {
    mp->nm_register_func(exports, module, mp->nm_priv);
  } else {
    dlib->Close();
    env->ThrowError("Module has no declared entry point.");
    return false;
  }
  return true;
}
In summary, this rather longish function takes care of the following things:
- It opens the native addon via DLib::Open()
- It handles loading errors
- It accesses the loaded module via thread_local_modpending
- It calls the module’s register function
But how is it possible to access module data via thread_local_modpending? After a call to DLib::Open(), thread_local_modpending holds the module’s address without a call to dlsym, so there’s got to be an additional loading mechanism.
NAPI_MODULE()
The N-API sample shown earlier contains the following line:
NAPI_MODULE(square, init)
NAPI_MODULE is a macro defined in src/node_api.h. It receives a module name and the name of an init function. When expanded, this macro results in the following snippet:
extern "C" {
  static napi_module _module = {
    1,
    flags,
    __FILE__,
    regfunc,
    "square",
    priv,
    {0},
  };
  static void _register_square(void) __attribute__((constructor));
  static void _register_square(void) {
    napi_module_register(&_module);
  }
}
This expanded macro will create a new N-API module:
typedef struct {
  int nm_version;
  unsigned int nm_flags;
  const char* nm_filename;
  napi_addon_register_func nm_register_func;
  const char* nm_modname;
  void* nm_priv;
  void* reserved[4];
} napi_module;
The interesting part here is what follows:
static void _register_square(void) __attribute__((constructor));
static void _register_square(void) {
napi_module_register(&_module);
}
_register_square is a function to, well, register our native addon called “square”. It passes our module’s address to napi_module_register, which will wrap its data in a node::node_module and pass it on to node_module_register in src/node_binding.cc. This is how our module’s address ends up in thread_local_modpending.
But still, there’s no call to _register_square, so how do things add up here?
The answer to this question is the constructor function attribute present on _register_square. Like a constructor in object-oriented programming, functions with this attribute are called automatically on “instantiation”, which in this case means when the shared library is loaded. As soon as we open our native addon via DLib::Open, _register_square will be called automatically and our module’s address is stored in thread_local_modpending before execution continues. The above applies to non-Windows platforms, but a similar concept exists on Windows.
What’s left to do is calling our module’s register function via mp->nm_context_register_func(exports, module, context, mp->nm_priv);. The register function returns our module’s exports and ultimately we’re able to use our native addon.
Loading native addons turns out to be quite interesting. It requires knowledge about compiler features, platform-specific library handling and some time to dig into it, but in the end it has been a fun thing to do!
Let's see what we will take a closer look at next!