QMK supports three different methods for enabling Unicode input and adding Unicode characters to your keymap. Each has its pros and cons in terms of flexibility and ease of use. Choose the one that best fits your use case.
The Basic method should be enough for most users. However, if you need a wider range of supported characters (including emoji, rare symbols etc.), you should use Unicode Map.
<br>
### 1.1. Basic Unicode :id=basic-unicode
The easiest to use method, albeit somewhat limited. It stores Unicode characters as keycodes in the keymap itself, so it only supports code points up to `0x7FFF`. This covers characters for most modern languages (including East Asian), as well as symbols, but it doesn't cover emoji.
Then add `UC(c)` keycodes to your keymap, where _c_ is the code point of the desired character (preferably in hexadecimal, up to 4 digits long). For example, `UC(0x40B)` will output [Ћ](https://unicode-table.com/en/040B/), and `UC(0x30C4)` will output [ツ](https://unicode-table.com/en/30C4).
In addition to standard character ranges, this method also covers emoji, ancient scripts, rare symbols etc. In fact, all possible code points (up to `0x10FFFF`) are supported. Here, Unicode characters are stored in a separate mapping table. You need to maintain a `unicode_map` array in your keymap file, which may contain at most 16384 entries.
Then add `X(i)` keycodes to your keymap, where _i_ is the desired character's index in the mapping table. This can be a numeric value, but it's recommended to keep the indices in an enum and access them by name.
Characters often come in lower and upper case pairs, such as å and Å. To make inputting these characters easier, you can use `XP(i, j)` in your keymap, where _i_ and _j_ are the mapping table indices of the lower and upper case character, respectively. If you're holding down Shift or have Caps Lock turned on when you press the key, the second (upper case) character will be inserted; otherwise, the first (lower case) version will appear.
This is most useful when creating a keymap for an international layout with special characters. Instead of having to put the lower and upper case versions of a character on separate keys, you can have them both on the same key by using `XP()`. This helps blend Unicode keys in with regular alphas.
Due to keycode size constraints, _i_ and _j_ can each only refer to one of the first 128 characters in your `unicode_map`. In other words, 0 ≤ _i_ ≤ 127 and 0 ≤ _j_ ≤ 127. This is enough for most use cases, but if you'd like to customize the index calculation, you can override the [`unicodemap_index()`](https://github.com/qmk/qmk_firmware/blob/71f640d47ee12c862c798e1f56392853c7b1c1a8/quantum/process_keycode/process_unicodemap.c#L36) function. This also allows you to, say, check Ctrl instead of Shift/Caps.
This method also supports all possible code points. As with the Unicode Map method, you need to maintain a mapping table in your keymap file. However, there are no built-in keycodes for this feature — you have to create a custom keycode or function that invokes this functionality.
By default, each table entry may be up to 3 code points long. This number can be changed by adding `#define UCIS_MAX_CODE_POINTS n` to your `config.h` file.
To use UCIS input, call `ucis_start()`. Then, type the mnemonic for the character (such as "rofl") and hit Space, Enter or Esc. QMK should erase the "rofl" text and insert the laughing emoji.
*`void ucis_start_user(void)`– This runs when you call the "start" function, and can be used to provide feedback. By default, it types out a keyboard emoji.
*`void ucis_success(uint8_t symbol_index)`– This runs when the input has matched something and has completed. By default, it doesn't do anything.
*`void ucis_symbol_fallback (void)`– This runs when the input doesn't match anything. By default, it falls back to trying that input as a Unicode code.
You can find the default implementations of these functions in [`process_ucis.c`](https://github.com/qmk/qmk_firmware/blob/master/quantum/process_keycode/process_ucis.c).
Unicode input in QMK works by inputting a sequence of characters to the OS, sort of like a macro. Unfortunately, the way this is done differs for each platform. Specifically, each platform requires a different combination of keys to trigger Unicode input. Therefore, a corresponding input mode has to be set in QMK.
To enable, go to _System Preferences > Keyboard > Input Sources_, add _Unicode Hex Input_ to the list (it's under _Other_), then activate it from the input dropdown in the Menu Bar.
By default, this mode uses the left Option key (`KC_LALT`) for Unicode input, but this can be changed by defining [`UNICODE_KEY_MAC`](#input-key-configuration) with a different keycode.
By default, this mode uses Ctrl+Shift+U (`LCTL(LSFT(KC_U))`) to start Unicode input, but this can be changed by defining [`UNICODE_KEY_LNX`](#input-key-configuration) with a different keycode. This might be required for IBus versions ≥1.5.15, where Ctrl+Shift+U behavior is consolidated into Ctrl+Shift+E.
Users who wish support in non-GTK apps without IBus may need to resort to a more indirect method, such as creating a custom keyboard layout ([more on this method](#custom-linux-layout)).
To enable, create a registry key under `HKEY_CURRENT_USER\Control Panel\Input Method` of type `REG_SZ` called `EnableHexNumpad` and set its value to `1`. This can be done from the Command Prompt by running `reg add "HKCU\Control Panel\Input Method" -v EnableHexNumpad -t REG_SZ -d 1` with administrator privileges. Reboot afterwards.
* **`UNICODE_MODE_BSD`**: _(non implemented)_ Unicode input under BSD. Not implemented at this time. If you're a BSD user and want to help add support for it, please [open an issue on GitHub](https://github.com/qmk/qmk_firmware/issues).
* **`UNICODE_MODE_WINCOMPOSE`**: Windows Unicode input using [WinCompose](https://github.com/samhocevar/wincompose). As of v0.9.0, supports code points up to `0x10FFFF` (all possible code points).
To enable, install the [latest release](https://github.com/samhocevar/wincompose/releases/latest). Once installed, WinCompose will automatically run on startup. This mode works reliably under all version of Windows supported by the app.
By default, this mode uses right Alt (`KC_RALT`) as the Compose key, but this can be changed in the WinCompose settings and by defining [`UNICODE_KEY_WINC`](#input-key-configuration) with a different keycode.
This example sets the board's default input mode to `UNICODE_MODE_LINUX`. You can replace this with `UNICODE_MODE_MACOS`, `UNICODE_MODE_WINCOMPOSE`, or any of the other modes listed [above](#input-modes). The board will automatically use the selected mode on startup, unless you manually switch to another mode (see [below](#keycodes)).
Note that the values are separated by commas. The board will remember the last used input mode and will continue using it on next power-up. You can disable this and force it to always start with the first mode in the list by adding `#define UNICODE_CYCLE_PERSIST false` to your `config.h`.
#### Keycodes
You can switch the input mode at any time by using the following keycodes. Adding these to your keymap allows you to quickly switch to a specific input mode, including modes not listed in `UNICODE_SELECTED_MODES`.
|`QK_UNICODE_MODE_NEXT` |`UC_NEXT`|Next in list |Cycle through selected modes, reverse direction when Shift is held |
|`QK_UNICODE_MODE_PREVIOUS` |`UC_PREV`|Prev in list |Cycle through selected modes in reverse, forward direction when Shift is held|
|`QK_UNICODE_MODE_MACOS` |`UC_MAC` |`UNICODE_MODE_MACOS` |Switch to macOS input |
|`QK_UNICODE_MODE_LINUX` |`UC_LINX`|`UNICODE_MODE_LINUX` |Switch to Linux input |
|`QK_UNICODE_MODE_WINDOWS` |`UC_WIN` |`UNICODE_MODE_WINDOWS` |Switch to Windows input |
|`QK_UNICODE_MODE_BSD` |`UC_BSD` |`UNICODE_MODE_BSD` |Switch to BSD input _(not implemented)_ |
|`QK_UNICODE_MODE_WINCOMPOSE`|`UC_WINC`|`UNICODE_MODE_WINCOMPOSE`|Switch to Windows input using WinCompose |
|`QK_UNICODE_MODE_EMACS` |`UC_EMAC`|`UNICODE_MODE_EMACS` |Switch to emacs (`C-x-8 RET`) |
You can also switch the input mode by calling `set_unicode_input_mode(x)` in your code, where _x_ is one of the above input mode constants (e.g. `UNICODE_MODE_LINUX`).
?> Using `UNICODE_SELECTED_MODES` is preferable to calling `set_unicode_input_mode()` in `matrix_init_user()` or similar functions, since it's better integrated into the Unicode system and has the added benefit of avoiding unnecessary writes to EEPROM.
If you have the [Audio feature](feature_audio.md) enabled on the board, you can set melodies to be played when you press the above keys. That way you can have some audio feedback when switching input modes.
For instance, you can add these definitions to your `config.h` file:
The functions for starting and finishing Unicode input on your platform can be overridden locally. Possible uses include customizing input mode behavior if you don't use the default keys, or adding extra visual/audio feedback to Unicode input.
*`void unicode_input_start(void)`– This sends the initial sequence that tells your platform to enter Unicode input mode. For example, it holds the left Alt key followed by Num+ on Windows, and presses the `UNICODE_KEY_LNX` combination (default: Ctrl+Shift+U) on Linux.
*`void unicode_input_finish(void)`– This is called to exit Unicode input mode, for example by pressing Space or releasing the Alt key.
You can find the default implementations of these functions in [`process_unicode_common.c`](https://github.com/qmk/qmk_firmware/blob/master/quantum/process_keycode/process_unicode_common.c).
You can customize the keys used to trigger Unicode input for macOS, Linux and WinCompose by adding corresponding defines to your `config.h`. The default values match the platforms' default settings, so you shouldn't need to change this unless Unicode input isn't working, or you want to use a different key (e.g. in order to free up left or right Alt).
This function is much like `send_string()`, but it allows you to input UTF-8 characters directly. It supports all code points, provided the selected input mode also supports it. Make sure your `keymap.c` file is formatted using UTF-8 encoding.
In `quantum/keymap_extras`, you'll see various language files — these work the same way as the ones for alternative layouts such as Colemak or BÉPO. When you include one of these language headers, you gain access to keycodes specific to that language / national layout. Such keycodes are defined by a 2-letter country/language code, followed by an underscore and a 4-letter abbreviation of the character to which the key corresponds. For example, including `keymap_french.h` and using `FR_UGRV` in your keymap will output `ù` when typed on a system with a native French AZERTY layout.
If the primary system layout you use on your machine is different from US ANSI, using these language-specific keycodes can help your QMK keymaps better match what will actually be output on the screen. However, keep in mind that these keycodes are just aliases for the corresponding default US keycodes under the hood, and that the HID protocol used by keyboards is itself inherently based on US ANSI.
The method does not require Unicode support in the keyboard itself but instead depends on [AutoHotkey](https://autohotkey.com) running in the background.
If you enable the US International layout on the system, it will use punctuation to accent the characters. For instance, typing "\`a" will result in à.
## Software keyboard layout on Linux :id=custom-linux-layout
This method does not require Unicode support on the keyboard itself but instead uses a custom keyboard layout for Xorg. This is how special characters are inserted by regular keyboards. This does not require IBus and works in practically all software. Help on creating a custom layout can be found [here](https://www.linux.com/news/creating-custom-keyboard-layouts-x11-using-xkb/), [here](http://karols.github.io/blog/2013/11/18/creating-custom-keyboard-layouts-for-linux/) and [here](https://wiki.archlinux.org/index.php/X_keyboard_extension). An example of how you could edit the `us` layout to gain 🤣 on `RALT(KC_R)`:
Edit the keyboard layout file `/usr/share/X11/xkb/symbols/us`.
Find the line defining the R key and add an entry to the list, making it look like this:
```
key <AD04> { [ r, R, U1F923 ] };
```
Save the file and run the command `setxkbmap us` to reload the layout.
You can define one custom character for key defined in the layout, and another if you populate the fourth layer. Additional layers up to 8th are also possible.
This method is specific to the computer on which you set the custom layout. The custom keys will be available only when Xorg is running. To avoid accidents, you should always reload the layout using `setxkbmap`, otherwise an invalid layout could prevent you from logging into your system, locking you out.