unihan-db
SQLAlchemy models for the UNIHAN CJK character database. unihan-db provides the schema and ORM layer. For the ETL pipeline, see unihan-etl. For end-user character lookups, see cihai.
Quickstart
Install and load UNIHAN data in 5 minutes.
Models & Bootstrap
Table models, bootstrap loader, and data importer.
Contributing
Development setup, code style, release process.
Install
With pip:
$ pip install unihan-db
Or, with uv:
$ uv add unihan-db
At a glance
from sqlalchemy import create_engine
from sqlalchemy.orm import Session

from unihan_db.bootstrap import bootstrap_unihan
from unihan_db.tables import Base, Unhn

engine = create_engine("sqlite:///unihan.db")

# Step 1: Create the schema
Base.metadata.create_all(engine)

# Step 2: Bootstrap data from the Unicode Consortium
bootstrap_unihan(engine)

# Step 3: Query characters
with Session(engine) as session:
    char = session.query(Unhn).filter_by(char="\u597D").first()
    if char:
        print(char.char, char.ucn)
See Quickstart for the full setup, including bootstrapping data from the Unicode Consortium.
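The query above looks a character up by its literal value and prints its `ucn` field alongside it. As a rough illustration of how the two relate, the sketch below converts between a character and its `U+XXXX` code point notation using only the standard library. The helper names `to_ucn` and `from_ucn` are hypothetical and not part of unihan-db, and it is an assumption that the `ucn` column stores values in the `U+597D` form, so check this against your loaded data.

```python
def to_ucn(char: str) -> str:
    """Return the U+XXXX notation for one character (hypothetical helper)."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    # Code points are written with at least four uppercase hex digits.
    return f"U+{ord(char):04X}"


def from_ucn(ucn: str) -> str:
    """Return the character for a U+XXXX string (hypothetical helper)."""
    if not ucn.upper().startswith("U+"):
        raise ValueError("expected a string like 'U+597D'")
    # Drop the "U+" prefix and parse the rest as hexadecimal.
    return chr(int(ucn[2:], 16))


print(to_ucn("\u597D"))    # U+597D
print(from_ucn("U+597D"))  # the character queried in the example above
```

If the assumption about the stored format holds, a helper like `to_ucn` could be used to build a `filter_by(ucn=...)` lookup when only the code point is known.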