Metadata-Version: 2.1
Name: idna
Version: 3.6
Summary: Internationalized Domain Names in Applications (IDNA)
Author-email: Kim Davies <kim+pypi@gumleaf.org>
Requires-Python: >=3.5
Description-Content-Type: text/x-rst
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Internet :: Name Service (DNS)
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Utilities
Project-URL: Changelog, https://github.com/kjd/idna/blob/master/HISTORY.rst
Project-URL: Issue tracker, https://github.com/kjd/idna/issues
Project-URL: Source, https://github.com/kjd/idna

Internationalized Domain Names in Applications (IDNA)
=====================================================

Support for the Internationalized Domain Names in
Applications (IDNA) protocol as specified in `RFC 5891
<https://tools.ietf.org/html/rfc5891>`_. This is the latest version of
the protocol and is sometimes referred to as “IDNA 2008”.

This library also provides support for Unicode Technical
Standard 46, `Unicode IDNA Compatibility Processing
<https://unicode.org/reports/tr46/>`_.

This acts as a suitable replacement for the “encodings.idna”
module that comes with the Python standard library, but which
only supports the older superseded IDNA specification (`RFC 3490
<https://tools.ietf.org/html/rfc3490>`_).

The basic functions are straightforward to call:

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト


Installation
------------

This package is available for installation from PyPI:

.. code-block:: bash

    $ python3 -m pip install idna


Usage
-----

For typical usage, the ``encode`` and ``decode`` functions will take a
domain name argument and perform a conversion to A-labels or U-labels
respectively.

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト

You may also use the standard codec encoding and decoding methods after
importing the ``idna.codec`` module:

.. code-block:: pycon

    >>> import idna.codec
    >>> print('домен.испытание'.encode('idna2008'))
    b'xn--d1acufc.xn--80akhbyknj4f'
    >>> print(b'xn--d1acufc.xn--80akhbyknj4f'.decode('idna2008'))
    домен.испытание

Conversions can be applied on a per-label basis using the ``ulabel`` or
``alabel`` functions if necessary:

.. code-block:: pycon

    >>> idna.alabel('测试')
    b'xn--0zwm56d'

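As a minimal sketch of the per-label functions, the snippet below converts
a single U-label to its A-label with ``alabel`` and back again with
``ulabel``. The variable names are illustrative only.

.. code-block:: python

    import idna

    # Convert a single U-label (Unicode form) to its A-label (ASCII form).
    a_label = idna.alabel('测试')
    print(a_label)  # b'xn--0zwm56d'

    # Convert the A-label back to the U-label it was derived from.
    u_label = idna.ulabel(a_label)
    print(u_label)
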
Compatibility Mapping (UTS #46)
+++++++++++++++++++++++++++++++

As described in `RFC 5895 <https://tools.ietf.org/html/rfc5895>`_, the
IDNA specification does not normalize input from different potential
ways a user may input a domain name. This functionality, known as
a “mapping”, is considered by the specification to be a local
user-interface issue distinct from IDNA conversion functionality.

This library provides one such mapping that was developed by the
Unicode Consortium. Known as `Unicode IDNA Compatibility Processing
<https://unicode.org/reports/tr46/>`_, it provides for both a regular
mapping for typical applications, as well as a transitional mapping to
help migrate from older IDNA 2003 applications.

For example, “Königsgäßchen” is not a permissible label as *LATIN
CAPITAL LETTER K* is not allowed (nor are capital letters in general).
UTS 46 will convert this into lower case prior to applying the IDNA
conversion.

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('Königsgäßchen')
    ...
    idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed
    >>> idna.encode('Königsgäßchen', uts46=True)
    b'xn--knigsgchen-b4a3dun'
    >>> print(idna.decode('xn--knigsgchen-b4a3dun'))
    königsgäßchen

Transitional processing provides conversions to help transition from
the older 2003 standard to the current standard. For example, in the
original IDNA specification, the *LATIN SMALL LETTER SHARP S* (ß) was
converted into two *LATIN SMALL LETTER S* (ss), whereas in the current
IDNA specification this conversion is not performed.

.. code-block:: pycon

    >>> idna.encode('Königsgäßchen', uts46=True, transitional=True)
    b'xn--knigsgsschen-lcb0w'

Implementers should use transitional processing with caution, only in
rare cases where conversion from legacy labels to current labels must be
performed (i.e. IDNA implementations that pre-date 2008). For typical
applications that just need to convert labels, transitional processing
is unlikely to be beneficial and could produce unexpected incompatible
results.

``encodings.idna`` Compatibility
++++++++++++++++++++++++++++++++

Function calls from the Python built-in ``encodings.idna`` module are
mapped to their IDNA 2008 equivalents using the ``idna.compat`` module.
Simply substitute the ``import`` clause in your code to refer to the new
module name.

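A brief sketch of that substitution, assuming the calling code only relies
on the ``ToASCII`` and ``ToUnicode`` label-conversion functions (other names
from ``encodings.idna`` may not have a working IDNA 2008 equivalent):

.. code-block:: python

    # Previously: from encodings.idna import ToASCII, ToUnicode
    from idna.compat import ToASCII, ToUnicode

    # Convert one label to its ASCII form, then back to Unicode.
    ascii_label = ToASCII('ドメイン')
    print(ascii_label)  # b'xn--eckwd4c7c'
    unicode_label = ToUnicode(ascii_label)
    print(unicode_label)
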
Exceptions
----------

All errors detected while performing a conversion according to the
specification should raise an exception derived from the
``idna.IDNAError`` base class.

More specific exceptions that may be raised are ``idna.IDNABidiError``,
when the error reflects an illegal combination of left-to-right and
right-to-left characters in a label; ``idna.InvalidCodepoint``, when
a specific codepoint is an illegal character in an IDN label (i.e.
INVALID); and ``idna.InvalidCodepointContext``, when the codepoint is
illegal based on its positional context (i.e. it is CONTEXTO or CONTEXTJ
but the contextual requirements are not satisfied).

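One way an application might handle these exceptions is sketched below.
The helper name ``try_encode`` is hypothetical and not part of the library;
the more specific exception is caught before the ``idna.IDNAError`` base
class.

.. code-block:: python

    import idna

    def try_encode(name):
        """Return the encoded form of ``name``, or None if conversion fails."""
        try:
            return idna.encode(name)
        except idna.InvalidCodepoint as exc:
            # A specific codepoint is not permitted in an IDN label.
            print('invalid codepoint:', exc)
        except idna.IDNAError as exc:
            # Any other IDNA failure (bidi rules, contextual rules, and so on).
            print('conversion failed:', exc)
        return None

    print(try_encode('ドメイン.テスト'))
    print(try_encode('Königsgäßchen'))  # capital K is rejected without uts46=True
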
Building and Diagnostics
------------------------

The IDNA and UTS 46 functionality relies upon pre-calculated lookup
tables for performance. These tables are derived by evaluating codepoints
against the eligibility criteria in the respective standards, and are
computed using the command-line script ``tools/idna-data``.

This tool will fetch relevant codepoint data from the Unicode repository
and perform the required calculations to identify eligibility. There are
three main modes:

* ``idna-data make-libdata``. Generates ``idnadata.py`` and
  ``uts46data.py``, the pre-calculated lookup tables used for IDNA and
  UTS 46 conversions. Implementers who wish to track this library against
  a different Unicode version may use this tool to manually generate a
  different version of the ``idnadata.py`` and ``uts46data.py`` files.

* ``idna-data make-table``. Generates a table of the IDNA disposition
  (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix
  B.1 of RFC 5892 and the pre-computed tables published by `IANA
  <https://www.iana.org/>`_.

* ``idna-data U+0061``. Prints debugging output on the various
  properties associated with an individual Unicode codepoint (in this
  case, U+0061), that are used to assess the IDNA and UTS 46 status of a
  codepoint. This is helpful in debugging or analysis.

The tool accepts a number of arguments, described using ``idna-data
-h``. Most notably, the ``--version`` argument allows the specification
of the version of Unicode to be used in computing the table data. For
example, ``idna-data --version 9.0.0 make-libdata`` will generate
library data against Unicode 9.0.0.


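To see which Unicode version the installed tables were generated from, a
check along the following lines may help. It assumes the generated
``idnadata`` module exposes a ``__version__`` string; that attribute is not
described above, so the sketch falls back to ``'unknown'`` if it is absent.

.. code-block:: python

    from idna import idnadata

    # Assumption: the generated module records the Unicode version it was
    # built from as ``__version__``; use a placeholder if it does not.
    unicode_version = getattr(idnadata, '__version__', 'unknown')
    print('IDNA tables built against Unicode', unicode_version)
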
Additional Notes
----------------

* **Packages**. The latest tagged release version is published in the
  `Python Package Index <https://pypi.org/project/idna/>`_.

* **Version support**. This library supports Python 3.5 and higher.
  As this library serves as a low-level toolkit for a variety of
  applications, many of which strive for broad compatibility with older
  Python versions, there is no rush to remove older interpreter support.
  Removing support for older versions should be well justified in that the
  maintenance burden has become too high.

* **Python 2**. Python 2 is supported by version 2.x of this library.
  While active development of the version 2.x series has ended, notable
  issues being corrected may be backported to 2.x. Use "idna<3" in your
  requirements file if you need this library for a Python 2 application.

* **Testing**. The library has a test suite based on each rule of the
  IDNA specification, as well as tests that are provided as part of the
  Unicode Technical Standard 46, `Unicode IDNA Compatibility Processing
  <https://unicode.org/reports/tr46/>`_.

* **Emoji**. It is an occasional request to support emoji domains in
  this library. Encoding of symbols like emoji is expressly prohibited by
  the technical standard IDNA 2008 and emoji domains are broadly phased
  out across the domain industry due to associated security risks. For
  now, applications that need to support these non-compliant labels
  may wish to consider trying the encode/decode operation in this library
  first, and then falling back to using ``encodings.idna``. See `the GitHub
  project <https://github.com/kjd/idna/issues/18>`_ for more discussion.

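The fallback suggested in the emoji note could look roughly like the sketch
below. The helper name ``encode_with_fallback`` is hypothetical, and the
standard library's ``idna`` codec implements the older IDNA 2003 rules, so
it may still reject some inputs with a ``UnicodeError``.

.. code-block:: python

    import idna

    def encode_with_fallback(name):
        """Try IDNA 2008 first, then fall back to the stdlib IDNA 2003 codec."""
        try:
            return idna.encode(name)
        except idna.IDNAError:
            # Non-compliant label (for example, one containing a symbol):
            # retry with the codec provided by ``encodings.idna``.
            return name.encode('idna')

    print(encode_with_fallback('ドメイン.テスト'))
    print(encode_with_fallback('☃.net'))

Applications that accept such labels take on the security risks mentioned
in the note, so the fallback is best limited to cases that require it.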