std::mblen

From cppreference.net

Defined in header `<cstdlib>`
int mblen( const char* s, std::size_t n );

Determines the size, in bytes, of the multibyte character whose first byte is pointed to by s.

If s is a null pointer, resets the global conversion state and determines whether shift sequences are used.
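The null-pointer form can be wrapped in a small query, sketched below. The helper name `encoding_is_state_dependent` is hypothetical and not part of the standard library; the result depends on the locale that is active when it is called.

```cpp
#include <cstdlib>

// Hypothetical helper: resets mblen's internal conversion state and reports
// whether the current multibyte encoding uses shift sequences.
bool encoding_is_state_dependent()
{
    // With a null s, mblen returns nonzero only for state-dependent encodings.
    return std::mblen(nullptr, 0) != 0;
}
```

Because the call also resets the internal state, it is best made before scanning a new string rather than in the middle of one.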

This function is equivalent to the call std::mbtowc(nullptr, s, n), except that the conversion state of std::mbtowc is unaffected.
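A minimal sketch of that equivalence, assuming both functions start from the initial conversion state. The helper `agrees_with_mbtowc` is a hypothetical name introduced only for illustration.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: checks that mblen and mbtowc(nullptr, ...) report the
// same length for the multibyte character at s (at most n bytes examined).
bool agrees_with_mbtowc(const char* s, std::size_t n)
{
    std::mblen(nullptr, 0);            // reset mblen's internal state
    std::mbtowc(nullptr, nullptr, 0);  // reset mbtowc's separate internal state
    const int via_mblen = std::mblen(s, n);
    const int via_mbtowc = std::mbtowc(nullptr, s, n); // null destination: length only
    return via_mblen == via_mbtowc;
}
```

The two resets are separate because each function keeps its own internal state, which is the one difference the text above points out.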

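Notes

Each call to mblen updates the internal global conversion state (a static object of type std::mbstate_t, known only to this function). If the multibyte encoding uses shift states, care must be taken to avoid backtracking or multiple scans. In any case, multiple threads should not call mblen without synchronization: std::mbrlen may be used instead.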
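As a sketch of the std::mbrlen alternative, the helper below keeps the conversion state in a local std::mbstate_t, so it shares no hidden state between callers. The name `count_mb_chars` and the use of std::optional to report failure are illustrative choices, not something this page prescribes.

```cpp
#include <cstddef>
#include <cwchar>
#include <optional>
#include <string_view>

// Hypothetical helper: counts multibyte characters with a caller-owned state.
// Returns std::nullopt if the input is invalid or ends mid-character.
std::optional<std::size_t> count_mb_chars(std::string_view s)
{
    std::mbstate_t state{}; // zero-initialized: the initial conversion state
    std::size_t count = 0;
    const char* ptr = s.data();
    const char* const end = ptr + s.size();
    while (ptr < end)
    {
        const std::size_t len =
            std::mbrlen(ptr, static_cast<std::size_t>(end - ptr), &state);
        if (len == static_cast<std::size_t>(-1) ||  // invalid sequence
            len == static_cast<std::size_t>(-2))    // incomplete character
            return std::nullopt;
        ptr += (len == 0 ? 1 : len); // mbrlen returns 0 for the null byte; step past it
        ++count;
    }
    return count;
}
```

The result still depends on the current C locale, so changing the locale concurrently from another thread remains unsafe.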

Parameters

s	-	pointer to the multibyte character
n	-	limit on the number of bytes in s that can be examined

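Return value

If s is not a null pointer, returns the number of bytes contained in the multibyte character, or -1 if the first bytes pointed to by s do not form a valid multibyte character, or 0 if s points at the null character '\0'.

If s is a null pointer, resets the internal conversion state to represent the initial shift state and returns 0 if the current multibyte encoding is not state-dependent (does not use shift sequences), or a non-zero value if the current multibyte encoding is state-dependent (uses shift sequences).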
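The three outcomes for a non-null s can be told apart as in the sketch below. The helper `describe_mblen` and its result strings are hypothetical and exist only to label each case.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: labels the result of mblen for a non-null s.
std::string describe_mblen(const char* s, std::size_t n)
{
    std::mblen(nullptr, 0); // start from the initial conversion state
    const int len = std::mblen(s, n);
    if (len == -1)
        return "invalid";        // the bytes do not form a valid character
    if (len == 0)
        return "null character"; // s points at '\0'
    return std::to_string(len) + " byte(s)";
}
```

For example, in the default "C" locale `describe_mblen("a", 1)` yields "1 byte(s)", while results for non-ASCII input depend on the active locale.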

Example

#include <clocale>
#include <cstdlib>
#include <iomanip>
#include <iostream>
#include <stdexcept>
#include <string_view>
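// The number of characters in a multibyte string is the number of mblen() calls needed to walk it
// Note: a simpler approach is std::mbstowcs(nullptr, s.data(), s.size()) when s is null-terminated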
std::size_t strlen_mb(const std::string_view s)
{
    std::mblen(nullptr, 0); // reset the conversion state
    std::size_t result = 0;
    const char* ptr = s.data();
    for (const char* const end = ptr + s.size(); ptr < end; ++result)
    {
        const int next = std::mblen(ptr, end - ptr);
        if (next == -1)
            throw std::runtime_error("strlen_mb(): conversion error");
        ptr += next > 0 ? next : 1; // an embedded null character (next == 0) still occupies one byte
    }
    return result;
}
void dump_bytes(const std::string_view str)
{
    std::cout << std::hex << std::uppercase << std::setfill('0');
    for (unsigned char c : str)
        std::cout << std::setw(2) << static_cast<int>(c) << ' ';
    std::cout << std::dec << '\n';
}
int main()
{
    // Allow mblen() to work with the UTF-8 multibyte encoding
    std::setlocale(LC_ALL, "en_US.utf8");
    // UTF-8 narrow multibyte encoding
    const std::string_view str = "z\u00df\u6c34\U0001f34c"; // or u8"zß水🍌"
    std::cout << std::quoted(str) << " is " << strlen_mb(str)
              << " characters, but as much as " << str.size() << " bytes: ";
    dump_bytes(str);
}

Possible output:

"zß水🍌" is 4 characters, but as much as 10 bytes: 7A C3 9F E6 B0 B4 F0 9F 8D 8C

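See also

mbtowc	converts the next multibyte character to a wide character (function)
mbrlen	returns the number of bytes in the next multibyte character, given state (function)
C documentation for mblen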
