Quickstart

This guide helps you install CLTK and run working examples using each backend.

Requirements

Python 3.13+

Install

Base library:
- pip: pip install cltk
Optional extras (choose any):
- Stanza: pip install "cltk[stanza]"
- OpenAI: pip install "cltk[openai]"
- Mistral: pip install "cltk[mistral]"
- Ollama (local or cloud): pip install "cltk[ollama]"
You can combine extras, e.g. pip install "cltk[openai,stanza,ollama,mistral]".
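
To confirm which optional backends are importable after installation, a small check like the following may help. It is a minimal sketch: the `EXTRA_MODULES` mapping and the `installed_extras` helper are hypothetical names introduced here, and the module name assumed for each extra (for example `mistralai` for the Mistral extra) is an assumption, not something CLTK documents.

```python
from importlib.util import find_spec

# Assumed mapping from CLTK extra name to the top-level module it provides.
# Illustration only: CLTK does not define this table.
EXTRA_MODULES = {
    "stanza": "stanza",
    "openai": "openai",
    "mistral": "mistralai",
    "ollama": "ollama",
}


def installed_extras() -> dict[str, bool]:
    """Return {extra_name: bool}, True when the assumed module can be found."""
    return {
        extra: find_spec(module) is not None
        for extra, module in EXTRA_MODULES.items()
    }


# Print one line per extra so missing dependencies are easy to spot.
for extra, ok in installed_extras().items():
    print(f"{extra:8s} {'installed' if ok else 'missing'}")
```
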

Environment Variables

Remote LLMs require an API key for access. To set an environment variable, do one of the following:

- Set it directly in the shell: export OPENAI_API_KEY='<YOUR-SECRET-KEY>'
- Set it inside your script: os.environ["OPENAI_API_KEY"] = "<YOUR-SECRET-KEY>"
  Important: Do not do this if you need to share your code publicly.
- Put it in a .env file located in the directory from which you run your code:

OPENAI_API_KEY=<YOUR-SECRET-KEY>
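
Before calling a remote backend, it can be useful to fail early when a key is absent. The sketch below uses only the standard library; `missing_env_vars` is a hypothetical helper, not part of CLTK. It only sees variables already present in the process environment, so a .env file that has not been loaded yet will not be reflected.

```python
import os


def missing_env_vars(names: list[str]) -> list[str]:
    """Return the names that are unset or empty in the current environment."""
    return [name for name in names if not os.environ.get(name)]


# Report any missing keys instead of failing later inside a backend call.
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print("Set these before using a remote backend:", ", ".join(missing))
```
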

Minimal Examples

Stanza

When NLP() is called without the backend parameter, it defaults to the Stanza pipelines.
Install cltk[stanza].
Stanza only supports the labeling of morphology, dependency syntax, and lemmatization.
Language models are installed as needed.
See Languages for languages supported by Stanza models.

from cltk import NLP
nlp = NLP("lati1261", backend="stanza", suppress_banner=True)
doc = nlp.analyze("Gallia est omnis divisa in partes tres.")
for w in doc.words[:10]:
    print(w.string, getattr(w.upos, "tag", None), w.lemma)

OpenAI

Install cltk[openai].
Requires OPENAI_API_KEY.
Defaults to model gpt-5-mini if not specified.

import os
os.environ["OPENAI_API_KEY"] = "..."  # or set in your shell/.env

from cltk import NLP
nlp = NLP("lati1261", backend="openai", suppress_banner=True)
doc = nlp.analyze("Gallia est omnis divisa in partes tres.")
print("FORM\tLEMMA\tUPOS\tFEATS")
for w in doc.words:
    upos = getattr(w.upos, "tag", "_")
    feats = "_"
    if getattr(w, "features", None) and getattr(w.features, "features", None):
        items = []
        for f in w.features.features:
            if getattr(f, "key", None) and getattr(f, "value", None):
                items.append(f"{f.key}={f.value}")
        feats = "|".join(items) if items else "_"
    print(f"{w.string}\t{w.lemma}\t{upos}\t{feats}")

Ollama

Ollama Local

Install cltk[ollama].
Install the Ollama server and run it on http://127.0.0.1:11434.
Defaults to model llama3.1:8b if not specified. You can pass any available model string.
Note: You may override the Ollama host, port, and model. See Advanced Configuration.

from cltk import NLP
nlp = NLP("lati1261", backend="ollama", suppress_banner=True)
doc = nlp.analyze("Gallia est omnis divisa in partes tres.")
print(len(doc.words), "tokens")
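
If the local example fails to connect, one quick diagnostic is to check whether anything is listening on the expected address. This is a minimal sketch using only the standard library; `server_is_listening` is a hypothetical helper, and a successful TCP connection shows only that some process accepts connections on that port, not that it is Ollama.

```python
import socket
from urllib.parse import urlsplit


def server_is_listening(base_url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(base_url)
    host = parts.hostname or "127.0.0.1"
    port = parts.port or 80  # fall back to the HTTP default when no port is given
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


print("Ollama reachable:", server_is_listening("http://127.0.0.1:11434"))
```
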

Ollama Cloud

Install cltk[ollama].
Set OLLAMA_CLOUD_API_KEY and use backend ollama-cloud.

import os
os.environ["OLLAMA_CLOUD_API_KEY"] = "oc-..."  # or set in your shell/.env

from cltk import NLP
nlp = NLP("lati1261", backend="ollama-cloud", suppress_banner=True)
doc = nlp.analyze("Gallia est omnis divisa in partes tres.")
print("usage:", doc.genai_use)

Choosing Language Identifiers

NLP(language_code=...) accepts Glottolog IDs (e.g. "lati1261"), ISO codes (e.g. "lat"), or exact language names (e.g. "Latin"). Internally, CLTK resolves to a Glottolog ID. See [languages.md] for more.
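
The idea of resolving several identifier types to one Glottolog ID can be illustrated with a toy lookup. This is a sketch only: the table and the `resolve_to_glottolog` function are hypothetical and do not reflect how CLTK stores or resolves languages internally.

```python
# Toy lookup table for illustration only; CLTK ships its own language data.
_TOY_LANGUAGES = [
    {"glottolog": "lati1261", "iso": "lat", "name": "Latin"},
    {"glottolog": "anci1242", "iso": "grc", "name": "Ancient Greek"},
]


def resolve_to_glottolog(identifier: str) -> str:
    """Map a Glottolog ID, ISO code, or exact name to a Glottolog ID."""
    for entry in _TOY_LANGUAGES:
        if identifier in (entry["glottolog"], entry["iso"], entry["name"]):
            return entry["glottolog"]
    raise ValueError(f"Unknown language identifier: {identifier!r}")


# All three spellings resolve to the same Glottolog ID.
for ident in ("lati1261", "lat", "Latin"):
    print(ident, "->", resolve_to_glottolog(ident))
```
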