Use the TDLib C interface

Jan 1, 2022 · 1656 words · 8 minute read

Backend C TDLib

TDLib is native to C++ only: any other language must use a binding, if one is available, or the TDLib JSON interface. Bindings are easy to use but exist for only a few languages. The JSON interface is available in every language (it is just a C dynamic library that operates on JSON strings), but it is less native than the original TDLib interface and it is slow. Fortunately, TDLib also offers a C interface, although it is rarely used, hard to use, and has some bugs. Over the last few months I wrote several Telegram bots and userbots with the TDLib C interface (abbreviated as tdc) and became familiar with it. However, I could not find any resources about tdc online, so I want to introduce this interface to the public. In this article, I will briefly explain how the interface works and how to use it.

Disclaimer:

I don’t have much experience with TDLib and C++. Some information may not be accurate.

TDLib changes quickly. Information in this article may be outdated.

Environment setup #

Before coding, I strongly recommend using CLion or another IDE. Developing and debugging will be hard without an IDE that can look up definitions and references.

Also, this tutorial will be based on CMake, the native build system for TDLib and its projects. TDLib works fine with CMake, and you can have your project run with minimal effort.

TDLib C interface is not hard to use once you get familiar with it, but be prepared to write ugly OOP in C.

Last note: TDLib can only be linked statically, and neither its ABI nor its API is stable. Be prepared to wait several minutes for a 300 MiB+ binary to link each time you build.

I will use CMake 3.22.1, C99 and the latest TDLib (master branch).

I will also use ASAN (Address Sanitizer) for debug builds. TDLib needs a special flag for ASAN to work properly. ASAN is optional, but it catches many memory errors early.

Build System #

TDLib is usually added to your repository as a Git submodule, for example under td/. Then, call add_subdirectory(td) in your CMakeLists.txt to import the TDLib targets. Finally, link the tdc target to your own target, and you are ready to go.

For example, suppose your project is laid out like this:

./main.c		# Your source code
./CMakeLists.txt	# Your CMakeLists
./td/			# Cloned TDLib repository
./td/.git
./td/CMakeLists.txt	# TDLib's CMakeLists

Then, your CMakeLists.txt could look like this:

# add_compile_definitions() requires CMake 3.12 or newer.
cmake_minimum_required(VERSION 3.12)
# Remember to add CXX to link against the C++ stdlib.
project(example VERSION 1.0 LANGUAGES C CXX)

set(CMAKE_C_STANDARD 99)

# Optional ASAN configuration for TDLib. See below.
IF (CMAKE_BUILD_TYPE MATCHES Debug)
    add_compile_definitions(TD_USE_ASAN)
ENDIF (CMAKE_BUILD_TYPE MATCHES Debug)

# Include TDLib directory and its CMakeLists.txt
add_subdirectory(td)

# Optional ASAN configuration for your code. See below.
set(CMAKE_C_FLAGS_DEBUG
        "${CMAKE_C_FLAGS_DEBUG} -g3 -O0 -fsanitize=address")
set(CMAKE_EXE_LINKER_FLAGS_DEBUG
        "${CMAKE_EXE_LINKER_FLAGS_DEBUG} -fsanitize=address")

add_executable(example main.c)
target_include_directories(example PUBLIC "${PROJECT_BINARY_DIR}")

# Statically link against tdc
target_link_libraries(example PRIVATE tdc)

Compile it, and you will have your first TDLib build.

Address Sanitizer #

Before doing anything for real, I would like to enable Address Sanitizer for debug builds. Address Sanitizer detects memory errors at runtime, which is a great help to C developers. Enabling it is as simple as adding -fsanitize=address to your CFLAGS and LDFLAGS, as illustrated in the following CMakeLists snippet:

set(CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} -g3 -O0 -fsanitize=address")
set(CMAKE_EXE_LINKER_FLAGS_DEBUG "${CMAKE_EXE_LINKER_FLAGS_DEBUG} -fsanitize=address")

However, you must do this after add_subdirectory(td), or the flags will apply to TDLib, and you will get a bunch of errors.

Moreover, TDLib needs a special flag to work with Address Sanitizer. Without it, Address Sanitizer complains about memory leaks, as I described in issue #1733. They aren’t real leaks but false positives, reported because TDLib was built without that flag. To suppress the warnings, use add_compile_definitions(TD_USE_ASAN):

# Add these lines before add_subdirectory(td) to get them applied to TDLib.
IF (CMAKE_BUILD_TYPE MATCHES Debug)
    add_compile_definitions(TD_USE_ASAN)
ENDIF (CMAKE_BUILD_TYPE MATCHES Debug)

Basic OOP structure of tdc #

Now, let’s learn how the C interface works. Your program should interact with td/telegram/td_c_client.h, which is implemented in td_c_client.cpp. It is a C wrapper for some basic C++ TDLib functions (e.g. init / send / receive). If you open these files in CLion before building the project for the first time, you will see lots of undefined reference errors, because many definitions (i.e. structs and functions) are generated at build time by tl_generate_c. The build produces td_tdc_api.h and td_tdc_api.cpp, which contain the C-style structs and wrapper functions for the C++ TDLib classes and functions.

TDLib is object oriented. Every struct in TDLib is an object, and it may derive from a “parent” object. The root of the hierarchy is struct TdObject, which has two integers: ID and refcnt. ID identifies the type of the object, and refcnt is used for memory deallocation. Every object struct starts with the same two members as TdObject, so a pointer to any object can be cast to a pointer to its parent type without losing data.

For example, when you call TdCClientSend(int, struct TdRequest), the TdRequest struct has a member that points to a struct TdFunction, which is an object. TdFunction starts with an ID and a refcnt as well, and the functions derived from it may add more members after these two integers for function-specific data (e.g. parameters).

To create an object, use the TdCreateObjectXXX() functions. They allocate a new struct of your type, set its ID to the ID of that type, and fill in the data you pass. For example, GetOption, which inherits from function, which in turn inherits from object, has the following definition:

struct TdGetOption {
  int ID;
  int refcnt;
  char *name_;
};

/* For your reference: */
struct TdFunction {
  int ID;
  int refcnt;
};
struct TdObject {
  int ID;
  int refcnt;
};

You can create a GetOption instance by:

struct TdGetOption *getOpt = TdCreateObjectGetOption("version");

Here, getOpt->ID is CODE_GetOption, and its name_ is a copy of the argument you specified. If I remember correctly, everything that goes into TDLib is copied (correct me if I am wrong).

You can then safely convert the pointers:

struct TdFunction *func = (struct TdFunction *) getOpt;
struct TdObject *obj = (struct TdObject *) func;

When deallocating objects, you need to use TdDestroyObjectXX(struct TdXX *ptr). There are destroy functions for every level of the hierarchy: you can pass a generic TdObject pointer into TdDestroyObjectObject(struct TdObject *object), or a specific TdGetOption pointer into TdDestroyObjectGetOption(struct TdGetOption *getOption). In the former case, td_tdc_api.cpp detects the type from the object ID and calls the specific destroy function for that type. In the latter case, TDLib calls the specific destroy function directly. If you pass an incompatible pointer to a destroy function, for example a Chat object to TdDestroyObjectFunction, it prints an error log (Unknown constructor ...). Inside the destroy functions, TDLib automatically frees the additional data (e.g. name_ in the TdGetOption struct) according to the refcnt member.
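The create/destroy pairing described above can be imitated with a small mock. This is a sketch under loud assumptions: none of the demo_ functions exist in tdc, the structs are stand-ins for the generated ones, DEMO_CODE_GetOption is a made-up type code, and reference counting is not modelled at all (refcnt is set to a placeholder value and never consulted). It only illustrates two ideas from the text: arguments are copied on creation, and a generic destroy function dispatches on the ID.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the generated structs. */
struct TdObject { int ID; int refcnt; };
struct TdGetOption { int ID; int refcnt; char *name_; };

enum { DEMO_CODE_GetOption = 42 }; /* made-up type code */

/* Mock constructor: copies its argument, as the article says TDLib does. */
struct TdGetOption *demo_create_get_option(const char *name) {
  struct TdGetOption *opt;
  size_t len;
  assert(name != NULL);
  len = strlen(name) + 1;
  opt = malloc(sizeof *opt);
  if (opt == NULL) {
    return NULL;
  }
  opt->name_ = malloc(len);
  if (opt->name_ == NULL) {
    free(opt);
    return NULL;
  }
  memcpy(opt->name_, name, len);
  opt->ID = DEMO_CODE_GetOption;
  opt->refcnt = 1; /* placeholder; this mock ignores reference counting */
  return opt;
}

/* Mock type-specific destructor: frees the owned copy, then the struct. */
void demo_destroy_get_option(struct TdGetOption *opt) {
  if (opt == NULL) {
    return;
  }
  free(opt->name_);
  free(opt);
}

/* Mock generic destructor: dispatches on ID.
 * Returns 1 if the object was destroyed, 0 if the ID is unknown. */
int demo_destroy_object(struct TdObject *obj) {
  if (obj == NULL) {
    return 1;
  }
  switch (obj->ID) {
    case DEMO_CODE_GetOption:
      demo_destroy_get_option((struct TdGetOption *)obj);
      return 1;
    default:
      return 0; /* the article reports an "Unknown constructor ..." log here */
  }
}
```

Because the constructor copies the string, the caller keeps ownership of the buffer it passed in and may free or reuse it right after the call.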

TDLib main loop #

If you have used any TDLib binding or the TDLib JSON interface before, you should be familiar with its main loop. After creating the TDLib client, you call the receive function to wait for new events. Whenever you need to send a request, you call the send function with a request ID. The result is then delivered back to you as an event carrying the same request ID. The TDLib C interface is no exception.

Use struct TdResponse TdCClientReceive(double timeout) to wait for new events. The returned TdResponse is a plain struct returned by value, not a TDLib object, so you do not need to free it. It has the following definition:

struct TdResponse {
  long long request_id;
  int client_id;
  struct TdObject *object;
};

If the request_id member is zero, the object member is an update (TdUpdate). Otherwise, request_id is the ID you passed with your request, and object is the result of that request. You must free the object yourself, preferably with TdDestroyObjectObject(struct TdObject *object).
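The rule above can be written as a small classifier. This is a minimal sketch: the structs are stand-ins copied from this article, demo_classify and enum demo_kind are hypothetical names, and the NULL check assumes that a receive call which times out returns a response with no object, which I have not verified against every TDLib version.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins copied from the article. */
struct TdObject { int ID; int refcnt; };
struct TdResponse {
  long long request_id;
  int client_id;
  struct TdObject *object;
};

enum demo_kind { DEMO_NOTHING, DEMO_UPDATE, DEMO_RESULT };

/* Hypothetical helper: decide what a response is before handling it. */
enum demo_kind demo_classify(const struct TdResponse *response) {
  assert(response != NULL);
  if (response->object == NULL) {
    return DEMO_NOTHING; /* assumption: nothing arrived before the timeout */
  }
  if (response->request_id == 0) {
    return DEMO_UPDATE;  /* unsolicited update */
  }
  return DEMO_RESULT;    /* answer to the request with this ID */
}

/* Shape of the loop in a real program (not compiled here):
 *
 *   struct TdResponse r = TdCClientReceive(1.0);
 *   switch (demo_classify(&r)) { ... handle update or result ... }
 *   if (r.object != NULL) TdDestroyObjectObject(r.object);
 *
 * Only r.object is destroyed; r itself is a by-value struct.
 */
```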

To make requests, use the void TdCClientSend(int client_id, struct TdRequest request) function. TdRequest is not a TDLib object either, but a plain struct with request_id and function:

struct TdRequest {
  long long request_id;
  struct TdFunction *function;
};

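The value of request_id is arbitrary, but it should be non-zero, because zero marks updates in responses. The function member is a TdFunction object, such as a TdGetOption cast to struct TdFunction *.

To obtain a client ID, call int TdCClientCreateId(). There is no destroy function for client IDs, because by the time the authorization state becomes closed, all resources have already been freed.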

Finally, I suggest using incrementing request IDs and keeping a map from request IDs to callbacks. This gives you maximum flexibility when making requests and handling results. You can find my implementation in tdutils.c.

Footnote: It seems that after creating the client ID, you must send a GetOption(version) request to it, or you won’t get any updates. I don’t know why.
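One possible shape for such a map is a small fixed-size table. The sketch below is not the tdutils.c implementation mentioned above; every demo_ name is hypothetical, struct TdObject is a stand-in, and the table size of 64 is an arbitrary choice. IDs start at 1 so that zero stays reserved for updates.

```c
#include <assert.h>
#include <stddef.h>

struct TdObject { int ID; int refcnt; }; /* stand-in copied from the article */

typedef void (*demo_callback)(struct TdObject *result, void *user_data);

#define DEMO_MAX_PENDING 64 /* arbitrary capacity for this sketch */

struct demo_pending {
  long long request_id; /* 0 marks a free slot */
  demo_callback callback;
  void *user_data;
};

struct demo_registry {
  long long last_id; /* last ID handed out; 0 is never handed out */
  struct demo_pending slots[DEMO_MAX_PENDING];
};

void demo_registry_init(struct demo_registry *reg) {
  size_t i;
  assert(reg != NULL);
  reg->last_id = 0;
  for (i = 0; i < DEMO_MAX_PENDING; i++) {
    reg->slots[i].request_id = 0;
  }
}

/* Registers a callback and returns a fresh non-zero request ID,
 * or 0 if the table is full. */
long long demo_registry_add(struct demo_registry *reg, demo_callback cb,
                            void *user_data) {
  size_t i;
  assert(reg != NULL && cb != NULL);
  for (i = 0; i < DEMO_MAX_PENDING; i++) {
    if (reg->slots[i].request_id == 0) {
      reg->slots[i].request_id = ++reg->last_id;
      reg->slots[i].callback = cb;
      reg->slots[i].user_data = user_data;
      return reg->slots[i].request_id;
    }
  }
  return 0;
}

/* Runs and removes the callback registered for request_id.
 * Returns 1 if one was found, 0 otherwise (including for updates). */
int demo_registry_dispatch(struct demo_registry *reg, long long request_id,
                           struct TdObject *result) {
  size_t i;
  assert(reg != NULL);
  if (request_id == 0) {
    return 0; /* updates carry no request ID */
  }
  for (i = 0; i < DEMO_MAX_PENDING; i++) {
    if (reg->slots[i].request_id == request_id) {
      struct demo_pending hit = reg->slots[i];
      reg->slots[i].request_id = 0; /* free the slot before calling back */
      hit.callback(result, hit.user_data);
      return 1;
    }
  }
  return 0;
}

/* Example callback: counts how many results arrived. */
void demo_count_call(struct TdObject *result, void *user_data) {
  (void)result;
  ++*(int *)user_data;
}
```

In a real program, the ID returned by demo_registry_add would go into TdRequest.request_id before calling TdCClientSend, and the main loop would pass each non-zero TdResponse.request_id to demo_registry_dispatch before destroying the object. The slot is released before the callback runs so that a callback can register follow-up requests.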

Wrap Up #

To conclude, here are the main points of this article:

Use CMake: put TDLib in a subdirectory, call add_subdirectory on it, and link against the tdc target.
To use ASAN, define TD_USE_ASAN before add_subdirectory(td).
TDLib is object oriented. All objects inherit struct TdObject, and they use ID to identify their types.
Use TdCreateObjectXXX functions to allocate objects, and use TdDestroyObjectXXX to destroy them. You can use TdDestroyObjectObject if you don’t know the type.
Use TdCClientCreateId to create a TDLib instance.
Use TdCClientSend and TdCClientReceive to make requests and wait for results or updates. If TdResponse.request_id is zero, it is an update.
All values given to TdCreateObjectXXX functions are copied.
You need to manually free TdResponse.object.
Make a GetOption(version) request after TdCClientCreateId.
Write a function to make requests. Use incrementing request IDs. Create a map from request IDs to callbacks. This simplifies your development.
Be prepared for long link time :D

You can check out my Mutebot (GitHub Mirror) project. It is a simple TDLib C based bot program.

If you have any questions, create an issue. The TDLib developers will help you out.