Katja v1.0.0 Documentation
Multi-Mode AI Chat Platform with Specialized LLMs
What is Katja?
Katja is a comprehensive AI chat platform featuring 5 specialized modes designed for different use cases. Built with a local-first philosophy, it uses Ollama for privacy-preserving AI inference, ensuring all data remains on your machine unless you opt for cloud fallback.
Why Katja?
- Privacy-First: Local inference via Ollama means your conversations never leave your machine
- Specialized Models: Purpose-built LLMs for legal reasoning, medical queries, and language learning
- Multi-Mode: 5 distinct chat modes optimized for different tasks
- Offline-Capable: Works without internet after initial model download
- Open Source: Fully transparent codebase with no telemetry
- Flexible: Choose between local privacy or cloud performance
Core Capabilities
Language Learning
- 7 European languages
- Real-time grammar correction
- Spaced repetition (SM-2)
- Export corrections
General Chat
- 14+ LLM models
- LaTeX rendering
- Model switching
- Structured responses
Legal Chat
- Saul-7B specialized model
- English law focus
- Legal citations
- Case law analysis
Medical Chat
- Meditron & BioMistral
- Maritime medical focus
- Examination guidance
- IMGS protocol support
Console
- Multi-session monitoring
- Real-time updates (3s)
- Session filtering
- Model tracking
Customization
- Editable system prompts
- Model selection
- Settings persistence
- CORS configuration
Features Overview
5 Specialized Chat Modes
| Mode | Models | Use Case | Key Features |
|---|---|---|---|
| Language Learning | Llama3, Qwen2, GLM4 | Practice 7 European languages | Grammar correction, spaced repetition, export corrections |
| General Chat | 14+ models, GPT-4o-mini | Technical discussions, research | LaTeX rendering, model switching, structured responses |
| Legal Chat | Saul-7B | English law queries | Legal citations, case law, doctrine explanations |
| Medical Chat | Meditron, BioMistral | Maritime medical guidance | Examination protocols, IMGS compliance, telemedicine prep |
| Console | All | Monitoring & debugging | Multi-session view, auto-refresh, session filtering |
Supported Languages
Language Learning mode supports 7 European languages:
- 🇸🇮 Slovene
- 🇩🇪 German
- 🇮🇹 Italian
- 🇭🇷 Croatian
- 🇫🇷 French
- 🇬🇧 English
- 🇵🇹 Portuguese
Privacy & Security
- Local-First Inference: All LLM processing via Ollama by default
- No Cloud Required: Works offline after model download
- No Telemetry: Zero data collection or tracking
- Session Isolation: Each browser tab gets unique session ID
- CORS Security: Configurable allowed origins
- Optional Cloud: Explicit opt-in for OpenAI fallback
Installation & Setup
Prerequisites
- Ollama: Install from ollama.ai
- Node.js 18+: For frontend
- Python 3.10+: For backend
- Git: To clone repository
- OpenAI API Key (Optional): For cloud fallback
Pull Recommended Models
# General-purpose chat
ollama pull llama3:8b-instruct-q4_K_M
# English law (Saul 7B)
ollama pull adrienbrault/saul-instruct-v1:Q4_K_M
# Medical (Meditron)
ollama pull meditron-7b
# Biomedical (BioMistral)
ollama pull biomistral-7b
# Multi-lingual options
ollama pull qwen2:7b
ollama pull glm4
Quick Start (Linux)
git clone https://github.com/SL-Mar/Katja.git
cd Katja
chmod +x launch-katja.sh
./launch-katja.sh
This launches all services:
- Language Learning: http://localhost:3000
- General Chat: http://localhost:3000/chat
- Legal Chat: http://localhost:3000/legal-chat
- Medical Chat: http://localhost:3000/medical-chat
- Console: http://localhost:3000/console
- Settings: http://localhost:3000/settings
- API Docs: http://localhost:8000/docs
Manual Setup
Backend Setup
cd backend
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
# Configure environment
cp .env.example .env
# Edit .env with your settings
# Start backend
uvicorn main:app --reload --host 0.0.0.0 --port 8000
Frontend Setup
cd frontend
npm install
npm run dev
Environment Configuration
Edit backend/.env file:
# LLM Provider (local = Ollama, openai = fallback)
LLM_PROVIDER=local
# Ollama settings
OLLAMA_MODEL=llama3:8b-instruct-q4_K_M
OLLAMA_BASE_URL=http://localhost:11434
# Optional: OpenAI fallback
# OPENAI_API_KEY=your_api_key_here
# CORS origins (comma-separated)
ALLOWED_ORIGINS=http://localhost:3000,http://localhost:3001,http://localhost:5173
System Architecture
High-Level Overview
┌─────────────┐      ┌──────────────┐      ┌─────────────┐
│   Next.js   │─────▶│   FastAPI    │─────▶│   Ollama    │
│  Frontend   │◀─────│   Backend    │◀─────│   (Local)   │
│ (Port 3000) │      │ (Port 8000)  │      │ (Port 11434)│
└─────────────┘      └──────────────┘      └─────────────┘
                            │
                            ▼
                     ┌──────────────┐
                     │    SQLite    │
                     │   Database   │
                     └──────────────┘
LLM Inference Flow
- User Message: Frontend sends message to /katja/chat
- Backend Receives: Loads conversation history from SQLite
- Prompt Construction: Builds system prompt + context
- Ollama Inference: Sends to local Ollama instance
- JSON Parsing: Extracts structured response (reply + grammar correction)
- Database Storage: Saves conversation to SQLite
- Response: Returns to frontend
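Condensed into code, the flow looks roughly like the sketch below. It uses the ollama Python client and an in-memory dict in place of SQLite; it is illustrative, not the actual handler in backend/routers/teacher.py.
import json
from collections import defaultdict
import ollama
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
history: dict[str, list[dict]] = defaultdict(list)  # in-memory stand-in for SQLite

class ChatRequest(BaseModel):
    session_id: str
    language: str
    text: str

@app.post("/katja/chat")
def chat(req: ChatRequest):
    # Steps 2-3: load context and build the prompt
    messages = [{"role": "system", "content": f"You are a {req.language} teacher."}]
    messages += history[req.session_id]
    messages.append({"role": "user", "content": req.text})
    # Step 4: local Ollama inference with enforced JSON output
    raw = ollama.chat(model="llama3:8b-instruct-q4_K_M", messages=messages, format="json")
    parsed = json.loads(raw["message"]["content"])  # Step 5: parse structured reply
    # Step 6: persist the turn (SQLite in the real backend)
    history[req.session_id] += [{"role": "user", "content": req.text},
                                {"role": "assistant", "content": parsed.get("reply", "")}]
    return parsed  # Step 7: return to frontend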
JSON Response Format
Enforced via Ollama's format: json parameter:
{
"reply": "Conversational response in target language",
"correction": {
"corrected": "Corrected version of user's text",
"explanation": "Grammar explanation",
"pattern": "Grammar pattern ID (if applicable)",
"severity": "minor|major|critical"
}
}
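Since the backend validates with Pydantic v2, this schema maps naturally to models along the following lines (a sketch; the class names are illustrative):
from typing import Literal, Optional
from pydantic import BaseModel

class Correction(BaseModel):
    corrected: str
    explanation: str
    pattern: Optional[str] = None  # grammar pattern ID, if applicable
    severity: Literal["minor", "major", "critical"]

class ChatReply(BaseModel):
    reply: str
    correction: Optional[Correction] = None  # omitted when nothing needs correcting

# Rejects malformed LLM output before it reaches the frontend
ChatReply.model_validate_json('{"reply": "Hallo!", "correction": null}')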
Architecture Components
Frontend Layer (Next.js)
- React 18: Component-based UI
- TypeScript: Type-safe development
- Tailwind CSS: Utility-first styling
- react-markdown + KaTeX: LaTeX rendering
- FontAwesome: Icon library
Backend Layer (FastAPI)
- FastAPI: Async Python web framework
- Pydantic v2: Data validation
- SQLite + WAL: Database with Write-Ahead Logging
- SlowAPI: Rate limiting
LLM Layer (Ollama)
- Local Inference: Privacy-preserving AI
- 14+ Models: Llama3, Qwen2, GLM4, Gemma2, etc.
- Specialized Models: Saul-7B (legal), Meditron (medical), BioMistral (biomedical)
- Cloud Fallback: Optional OpenAI GPT-4o-mini
Data Layer
- SQLite Database: Persistent storage
- WAL Mode: Write-Ahead Logging for concurrency
- Session Isolation: Unique session IDs per browser tab
- Auto-Initialization: Tables created on first run
Technology Stack
Frontend Technologies
| Technology | Version | Purpose |
|---|---|---|
| Next.js | 13+ | React framework with routing |
| React | 18 | UI component library |
| TypeScript | 5 | Type-safe JavaScript |
| Tailwind CSS | 3 | Utility-first CSS framework |
| react-markdown | Latest | Markdown rendering |
| KaTeX | Latest | LaTeX math rendering |
| FontAwesome | 6 | Icon library |
Backend Technologies
| Technology | Version | Purpose |
|---|---|---|
| FastAPI | 0.104+ | High-performance async API |
| Python | 3.10+ | Backend programming language |
| SQLite | 3 | Embedded database |
| Pydantic | 2.0 | Data validation |
| SlowAPI | Latest | Rate limiting |
LLM Models
| Model | Size | Specialty |
|---|---|---|
| llama3:8b-instruct-q4_K_M | 4.9GB | General-purpose |
| adrienbrault/saul-instruct-v1:Q4_K_M | 4.1GB | English law |
| meditron-7b | 4.1GB | Maritime medical |
| biomistral-7b | 4.1GB | Biomedical |
| qwen2:7b | 4.4GB | Multi-lingual |
| glm4 | 9.4GB | Multi-lingual |
| gemma2:9b | 5.4GB | General-purpose |
Configuration Parameters
| Parameter | Value | Purpose |
|---|---|---|
| Response Length | 4096 tokens | Maximum response size |
| Temperature | 0.0 | Deterministic (medical/legal accuracy) |
| JSON Format | Enforced | Structured responses |
| WAL Mode | Enabled | Better database concurrency |
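With the ollama Python client, these parameters translate roughly to the call below (a sketch; num_predict is Ollama's option for maximum response tokens):
import ollama

response = ollama.chat(
    model="llama3:8b-instruct-q4_K_M",
    messages=[{"role": "user", "content": "Explain WAL mode in one sentence."}],
    format="json",           # enforce structured JSON responses
    options={
        "temperature": 0.0,  # deterministic output for medical/legal accuracy
        "num_predict": 4096, # maximum response length in tokens
    },
)
print(response["message"]["content"])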
Language Learning Mode
Overview
Language Learning mode provides real-time grammar correction and conversational practice in 7 European languages. Designed as an immersive learning tool with instant feedback.
Key Features
- 7 Languages: Slovene, German, Italian, Croatian, French, English, Portuguese
- Real-Time Grammar Correction: Instant feedback with explanations in target language
- Spaced Repetition: SM-2 algorithm for reviewing grammar patterns
- Export Corrections: Save corrections as JSON for external tools
- Auto-Clear on Switch: Fresh conversation when changing languages
- Severity Levels: Minor, Major, Critical grammar issues
How It Works
- Select target language via flag buttons
- Type message in target language
- Receive conversational response
- Get grammar correction with explanation (if applicable)
- Review corrections using spaced repetition
Grammar Correction Format
{
"corrected": "Corrected version of text",
"explanation": "Why the correction was needed",
"pattern": "grammar_pattern_123",
"severity": "major"
}
Spaced Repetition (SM-2 Algorithm)
Grammar corrections are automatically added to a spaced repetition queue using the SuperMemo-2 algorithm:
- Initial Interval: 1 day
- Easy Factor: 2.5 (adjusts based on performance)
- Review Outcomes: Easy, Good, Hard, Again
- Adaptive Scheduling: Intervals increase with successful reviews
Supported Languages Details
| Language | Code | Grammar Patterns |
|---|---|---|
| 🇸🇮 Slovene | slovene | Dual form, case system, verbal aspects |
| 🇩🇪 German | german | Cases, verb positions, gender articles |
| 🇮🇹 Italian | italian | Verb conjugations, articles, prepositions |
| 🇭🇷 Croatian | croatian | Case system, verb aspects, Ijekavian/Ekavian |
| 🇫🇷 French | french | Articles, liaisons, subjunctive mood |
| 🇬🇧 English | english | Tenses, prepositions, articles |
| 🇵🇹 Portuguese | portuguese | Verb conjugations, gender agreement |
General Chat Mode
Overview
General Chat mode provides flexible AI conversations with 14+ LLM models and support for technical discussions, including LaTeX math rendering.
Key Features
- 14+ Models: Choose from Llama, Qwen, GLM4, Gemma, and more
- LaTeX Rendering: Inline $...$ or display $$...$$ math formulas
- Model Switching: Live dropdown selector
- Structured Responses: Optimized system prompts for technical queries
- GPT-4o-mini Support: Optional OpenAI cloud fallback
- Session Memory: Conversation context maintained
LaTeX Math Support
Rendered using KaTeX for beautiful mathematical typesetting:
Inline Math
The quadratic formula is $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
Display Math
$$
\int_0^\infty e^{-x^2} dx = \frac{\sqrt{\pi}}{2}
$$
Example Queries
- "What is Bayes' theorem and how is it used in statistics?"
- "Explain quantum entanglement in simple terms"
- "Write a Python function to calculate Fibonacci numbers"
- "Derive the Navier-Stokes equations"
Available Models
All Ollama models installed on your system are automatically detected and available in the dropdown:
- llama3:8b-instruct-q4_K_M
- qwen2:7b
- glm4
- gemma2:9b
- mistral:7b
- And any other models you pull
Legal Chat Mode
Overview
Legal Chat mode uses the specialized Saul-7B model fine-tuned for English law, contracts, torts, and legal reasoning.
Key Features
- Saul-7B Model: Fine-tuned for English common law
- Legal Citations: Properly formatted case references
- Precedent Explanations: Detailed case law analysis
- Legal Doctrine: Equity, trusts, negligence, contract law
- LaTeX Support: Formatted legal citations and principles
Legal Domains Covered
- Contract Law: Formation, consideration, breach, remedies
- Tort Law: Negligence, duty of care, causation
- Equity: Trusts, equitable remedies, fiduciary duties
- Insurance Law: Good faith, utmost good faith doctrine
- Case Law: Precedent analysis and legal reasoning
Example Queries
- "What is the leading case in negligence?" → Donoghue v Stevenson [1932]
- "Explain the doctrine of consideration in contract law"
- "Is there an obligation of good faith in insurance law?"
- "What are the elements of a valid trust?"
Saul-7B Model Details
| Property | Value |
|---|---|
| Model Name | adrienbrault/saul-instruct-v1:Q4_K_M |
| Size | 4.1GB (quantized) |
| Specialty | English common law |
| Fine-Tuned On | Legal texts, case law, doctrines |
| Temperature | 0.0 (deterministic for accuracy) |
Maritime Medical Chat Mode
Overview
Maritime Medical Chat mode is designed for ship's officers managing medical care on vessels without doctors, using Meditron and BioMistral models.
- NO DIAGNOSIS OR TREATMENT: Ship's medical officers are NOT authorized to diagnose or treat (except immediate life-saving measures)
- PHYSICIAN AUTHORIZATION REQUIRED: All interventions beyond basic first aid must be authorized by shore-based physician
- EXAMINATION ONLY: This tool focuses on gathering accurate information for remote physicians
Key Features
- Meditron & BioMistral: Medical-domain fine-tuned models
- Maritime Focus: Ship's officer medical care guidance
- Examination Protocols: Systematic patient assessment
- Telemedicine Prep: Organize observations for physician consultations
- IMGS Protocol: International Medical Guide for Ships compliance
- No Diagnosis/Treatment: Emphasizes observation and communication
Examination Guidance
The system guides ship's officers through systematic patient examination:
1. Vital Signs
- Blood pressure
- Heart rate
- Respiratory rate
- Temperature
- Oxygen saturation (if available)
2. SAMPLE History
- Signs and symptoms
- Allergies
- Medications
- Past medical history
- Last oral intake
- Events leading to illness/injury
3. Physical Examination
- General appearance
- Head-to-toe examination
- Neurological status
- Specific area of concern
Example Queries
- "How should I perform a complete physical examination on a crew member with chest pain?"
- "What vital signs should I collect for the ship's doctor consultation?"
- "Guide me through assessing a patient's neurological status"
- "What observations should I document for a suspected appendicitis case?"
IMGS Compliance
Follows protocols from the International Medical Guide for Ships (WHO publication):
- Standardized examination procedures
- Telemedicine consultation preparation
- Medical record documentation
- Shore-based physician communication
Model Details
| Model | Size | Specialty |
|---|---|---|
| meditron-7b | 4.1GB | Maritime medical guidance |
| biomistral-7b | 4.1GB | Biomedical terminology |
Console Mode
Overview
Console mode provides a terminal-style monitoring interface for all LLM interactions across all chat modes.
Key Features
- Terminal Aesthetic: Hacker-style green monospace text
- Multi-Session Monitoring: View all conversations in one place
- Real-Time Updates: Auto-refresh every 3 seconds
- Session Filtering: Filter by language, general, legal, medical
- Model Tracking: See which model was used for each response
- Timestamp Tracking: Full conversation history with precise timestamps
Console Display Format
[2025-11-10 17:30:15] [language] [llama3:8b-instruct-q4_K_M]
USER: Zdravo, kako si?
ASSISTANT: Jaz sem dobro, hvala! Kako si ti?
CORRECTION: None
[2025-11-10 17:31:22] [general] [qwen2:7b]
USER: Explain quantum entanglement
ASSISTANT: Quantum entanglement is a phenomenon where...
[2025-11-10 17:32:45] [legal] [saul-instruct-v1]
USER: What is the leading case in negligence?
ASSISTANT: The leading case is Donoghue v Stevenson [1932] AC 562...
Filtering Options
- All: Show all conversations
- Language Learning: Only language practice sessions
- General Chat: General-purpose conversations
- Legal Chat: Legal queries
- Medical Chat: Medical consultations
Use Cases
- Debugging: Track which models are being used
- Analysis: Review conversation patterns
- Export: Copy logs for external analysis
- Monitoring: Watch real-time LLM responses
LLM Routing
Overview
Katja's LLM routing system intelligently selects between local Ollama models and cloud OpenAI fallback based on availability and configuration.
Routing Logic
1. Check LLM_PROVIDER in .env
   ├─ "local"  → Use Ollama
   └─ "openai" → Use OpenAI API
2. If Ollama selected:
   ├─ Check if Ollama is running (http://localhost:11434)
   ├─ If available → Use selected Ollama model
   └─ If unavailable AND OPENAI_API_KEY set → Fall back to GPT-4o-mini
3. If OpenAI selected:
   └─ Use GPT-4o-mini directly
Model Selection Priority
- User Selection: Model selected in UI dropdown (highest priority)
- Environment Variable: OLLAMA_MODEL in .env
- Default Model: llama3:8b-instruct-q4_K_M
- Cloud Fallback: gpt-4o-mini (if configured)
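The priority order amounts to a small resolution helper, sketched here (the function name is illustrative); the cloud fallback itself is handled at request time, as the next snippet shows:
import os

def resolve_model(ui_selection: str | None) -> str:
    """Pick a model: UI dropdown > OLLAMA_MODEL env var > built-in default."""
    return (ui_selection
            or os.getenv("OLLAMA_MODEL")
            or "llama3:8b-instruct-q4_K_M")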
Automatic Fallback
If local Ollama is unavailable and an OpenAI API key is configured:
try:
    # Attempt local inference first
    response = ollama_client.chat(model=selected_model, ...)
except Exception:
    if OPENAI_API_KEY:
        logger.warning("Ollama unavailable, falling back to GPT-4o-mini")
        response = openai_client.chat(model="gpt-4o-mini", ...)
    else:
        raise RuntimeError("Ollama unavailable and no OpenAI fallback configured")
Model Discovery
Available models are automatically discovered from Ollama:
GET http://localhost:11434/api/tags
Response:
{
"models": [
{"name": "llama3:8b-instruct-q4_K_M"},
{"name": "qwen2:7b"},
{"name": "saul-instruct-v1:Q4_K_M"},
...
]
}
Any model installed with ollama pull appears in the dropdown automatically.
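A minimal sketch of that discovery call using the requests library:
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
models = [m["name"] for m in resp.json()["models"]]
print(models)  # e.g. ['llama3:8b-instruct-q4_K_M', 'qwen2:7b', ...]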
Grammar Correction System
Overview
The grammar correction system analyzes user input in the target language and provides detailed feedback with explanations.
Correction Process
- Input Analysis: User message parsed by LLM
- Error Detection: Grammar, syntax, spelling issues identified
- Severity Assignment: Minor, Major, or Critical
- Correction Generation: Corrected version created
- Explanation: Why the correction was needed (in target language)
- Pattern Matching: Link to grammar pattern database (if applicable)
Severity Levels
| Severity | Description | Example |
|---|---|---|
| Minor | Small typo or informal usage | Missing accent mark, capitalization |
| Major | Grammar mistake affecting clarity | Wrong verb tense, incorrect case |
| Critical | Fundamental error changing meaning | Wrong gender, completely incorrect structure |
Grammar Pattern Database
Pre-seeded with common grammar patterns for each language:
# German example
{
"pattern_id": "de_case_accusative",
"language": "german",
"category": "cases",
"rule": "Accusative case after verbs of motion with direction",
"example": "Ich gehe in den Park (not in der Park)",
"difficulty": "intermediate"
}
Correction JSON Schema
{
"corrected": "Corrected version of the text",
"explanation": "Detailed explanation in target language",
"pattern": "grammar_pattern_id or null",
"severity": "minor | major | critical"
}
Spaced Repetition System
Overview
Katja implements the SuperMemo-2 (SM-2) algorithm for optimally scheduling grammar correction reviews.
SM-2 Algorithm
The algorithm adjusts review intervals based on your performance:
Parameters
- EF (Easiness Factor): Starts at 2.5
- Interval: Days until next review
- Repetitions: Number of successful reviews
Review Outcomes
| Outcome | Quality | Effect |
|---|---|---|
| Again | 0 | Reset interval to 1 day |
| Hard | 3 | Slightly increase interval, reduce EF |
| Good | 4 | Normal interval increase |
| Easy | 5 | Large interval increase, increase EF |
Interval Calculation
def sm2_schedule(quality: int, repetitions: int, previous_interval: int, EF: float):
    """Return updated (repetitions, interval, EF) after a review (quality 0-5)."""
    if quality < 3:
        # Failed - reset the schedule (repetitions stay at 0)
        interval = 1
        repetitions = 0
    else:
        if repetitions == 0:
            interval = 1
        elif repetitions == 1:
            interval = 6
        else:
            interval = round(previous_interval * EF)
        repetitions += 1
    # Update EF, clamped at the SM-2 minimum of 1.3
    EF = EF + (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    EF = max(EF, 1.3)
    return repetitions, interval, EF
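Three consecutive Good reviews (quality 4) on a fresh correction illustrate the schedule; with the sketch above, the intervals come out to 1, 6, and 15 days while EF stays at 2.5:
reps, interval, ef = 0, 0, 2.5
for _ in range(3):
    reps, interval, ef = sm2_schedule(quality=4, repetitions=reps,
                                      previous_interval=interval, EF=ef)
    print(interval)  # 1, then 6, then 15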
Review Workflow
- Navigate to the Reviews page (/reviews)
- View corrections due for review
- For each correction:
  - See the original text
  - Recall the correct version
  - Rate difficulty: Again, Hard, Good, Easy
- Next review is scheduled automatically
Session Management
Overview
Katja uses unique session IDs to isolate conversations across different browser tabs and chat modes.
Session ID Generation
// Frontend: persist a session ID in localStorage
// (generateUUID can be, e.g., crypto.randomUUID() in modern browsers)
const sessionId = localStorage.getItem('sessionId') || generateUUID();
localStorage.setItem('sessionId', sessionId);
Session Isolation
- Per Browser Tab: Each tab gets unique session ID
- Persistent: Survives page refreshes
- Mode-Specific: Language learning vs general chat have separate histories
- Database Storage: All sessions stored in SQLite
Conversation Context
Each API call includes session history for context-aware responses:
POST /katja/chat
{
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"language": "german",
"text": "Wie geht es dir?"
}
Backend loads history:
SELECT * FROM conversation_history
WHERE session_id = ?
ORDER BY timestamp DESC
LIMIT 10
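In Python, that lookup might look like the sketch below; note the reversal, since the query returns newest-first (table and column names follow the database schema later in this document):
import sqlite3

def load_history(db_path: str, session_id: str, limit: int = 10) -> list[dict]:
    """Fetch the most recent turns for a session, oldest first."""
    con = sqlite3.connect(db_path)
    con.row_factory = sqlite3.Row
    rows = con.execute(
        "SELECT user_message, assistant_reply, timestamp "
        "FROM conversation_history WHERE session_id = ? "
        "ORDER BY timestamp DESC LIMIT ?",
        (session_id, limit),
    ).fetchall()
    con.close()
    return [dict(r) for r in reversed(rows)]  # chronological order for the prompt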
Clearing Sessions
Each mode has a clear button to reset conversation:
DELETE /katja/clear?session_id={sessionId}
# Deletes all conversation history for this session
API Reference Overview
Base URL
http://localhost:8000
Authentication
No authentication required for local development. CORS origins configured in .env file.
Response Format
All responses are JSON:
{
"reply": "Assistant response",
"correction": {
"corrected": "Corrected text",
"explanation": "Explanation",
"pattern": "pattern_id",
"severity": "major"
}
}
Error Handling
{
"detail": "Error message",
"type": "error_type"
}
Common HTTP Status Codes
| Code | Meaning |
|---|---|
| 200 | OK - Request successful |
| 400 | Bad Request - Invalid input |
| 500 | Internal Server Error - LLM or database error |
| 503 | Service Unavailable - Ollama not running |
Rate Limiting
Implemented via SlowAPI:
- Default: 60 requests per minute per IP
- Configurable: Adjust in backend code
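A minimal SlowAPI setup consistent with that default might look like this sketch (not the project's actual wiring):
from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app = FastAPI()
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/katja/chat")
@limiter.limit("60/minute")        # default: 60 requests per minute per IP
async def chat(request: Request):  # SlowAPI requires the Request parameter
    ...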
Chat API
Language learning chat with grammar correction
Source: backend/routers/teacher.py
# Request
{
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"language": "german",
"text": "Ich gehe in der Park"
}
# Response
{
"reply": "In den Park! Du meinst 'Ich gehe in den Park'.",
"correction": {
"corrected": "Ich gehe in den Park",
"explanation": "Nach 'in' mit Bewegungsrichtung steht der Akkusativ",
"pattern": "de_case_accusative",
"severity": "major"
}
}
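Calling the endpoint from Python with requests (a sketch; the generous timeout allows for slow local inference):
import requests

payload = {
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "language": "german",
    "text": "Ich gehe in der Park",
}
resp = requests.post("http://localhost:8000/katja/chat", json=payload, timeout=120)
resp.raise_for_status()
data = resp.json()
print(data["reply"])
if data.get("correction"):
    print(data["correction"]["corrected"])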
General/Legal/Medical chat
Source: backend/routers/teacher.py
# Request
{
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"text": "What is the quadratic formula?",
"model": "llama3:8b-instruct-q4_K_M"
}
# Response
{
"reply": "The quadratic formula is $x = \\frac{-b \\pm \\sqrt{b^2 - 4ac}}{2a}$"
}
Clear language learning conversation history
DELETE /katja/clear?session_id=550e8400-e29b-41d4-a716-446655440000
Response: { "status": "cleared" }
Clear generic chat history
DELETE /katja/generic-chat/clear?session_id=550e8400-e29b-41d4-a716-446655440000
Response: { "status": "cleared" }
Models API
List all available Ollama models
Source: backend/routers/system.py
# Response
{
"models": [
"llama3:8b-instruct-q4_K_M",
"qwen2:7b",
"glm4",
"saul-instruct-v1:Q4_K_M",
"meditron-7b",
"biomistral-7b"
]
}
List models with metadata
# Response
{
"models": [
{
"name": "llama3:8b-instruct-q4_K_M",
"size": "4.9GB",
"modified": "2025-10-15T10:30:00Z"
}
]
}
Switch active model
# Request
{
"model": "qwen2:7b"
}
# Response
{
"status": "model_selected",
"model": "qwen2:7b"
}
Console API
Get conversation history for console
Source: backend/routers/system.py
# Query Parameters
?type=language # language | general | all
?limit=100
# Response
{
"history": [
{
"timestamp": "2025-11-10T17:30:15Z",
"type": "language",
"model": "llama3:8b-instruct-q4_K_M",
"user_message": "Zdravo",
"assistant_reply": "Živjo! Kako si?",
"correction": null
}
]
}
Stream backend logs (real-time)
GET /katja/system/logs
# Streaming response (Server-Sent Events)
data: [2025-11-10 17:30:15] INFO: Chat request received
data: [2025-11-10 17:30:16] INFO: Using model llama3:8b-instruct-q4_K_M
data: [2025-11-10 17:30:18] INFO: Response generated successfully
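The SSE stream can be consumed from Python by reading the response line by line (a sketch using requests):
import requests

with requests.get("http://localhost:8000/katja/system/logs",
                  stream=True, timeout=(5, None)) as resp:
    for line in resp.iter_lines(decode_unicode=True):
        if line.startswith("data: "):
            print(line[len("data: "):])  # one log record per SSE event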
Database Schema
Overview
Katja uses SQLite with WAL (Write-Ahead Logging) mode for better concurrency. Database auto-initializes on first startup.
Tables
user_profile
CREATE TABLE user_profile (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT UNIQUE NOT NULL,
current_language TEXT DEFAULT 'english',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
total_corrections INTEGER DEFAULT 0,
total_conversations INTEGER DEFAULT 0
);
conversation_history
CREATE TABLE conversation_history (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT NOT NULL,
language TEXT NOT NULL,
user_message TEXT NOT NULL,
assistant_reply TEXT NOT NULL,
original_text TEXT,
corrected_text TEXT,
timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (session_id) REFERENCES user_profile(session_id)
);
grammar_corrections
CREATE TABLE grammar_corrections (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT NOT NULL,
language TEXT NOT NULL,
original TEXT NOT NULL,
corrected TEXT NOT NULL,
explanation TEXT,
pattern_id TEXT,
severity TEXT,
next_review_date DATE,
easiness_factor REAL DEFAULT 2.5,
interval INTEGER DEFAULT 0,
repetitions INTEGER DEFAULT 0,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (session_id) REFERENCES user_profile(session_id),
FOREIGN KEY (pattern_id) REFERENCES grammar_patterns(pattern_id)
);
grammar_patterns
CREATE TABLE grammar_patterns (
pattern_id TEXT PRIMARY KEY,
language TEXT NOT NULL,
category TEXT,
rule TEXT NOT NULL,
example TEXT,
difficulty TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
generic_conversations
CREATE TABLE generic_conversations (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT NOT NULL,
role TEXT NOT NULL, -- 'user' or 'assistant'
content TEXT NOT NULL,
model_used TEXT,
timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Indexes
CREATE INDEX idx_corrections_session ON grammar_corrections(session_id);
CREATE INDEX idx_corrections_review ON grammar_corrections(next_review_date);
CREATE INDEX idx_conversations_session ON conversation_history(session_id);
CREATE INDEX idx_generic_session ON generic_conversations(session_id);
WAL Mode
Enabled for better concurrency (multiple reads during writes):
PRAGMA journal_mode = WAL;
The database file lives at backend/data/katja.db (auto-created on first run, not tracked in git).
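Enabling WAL from Python takes one pragma on the connection; the mode then persists in the database file (sketch):
import sqlite3

con = sqlite3.connect("backend/data/katja.db")
con.execute("PRAGMA journal_mode = WAL;")
print(con.execute("PRAGMA journal_mode;").fetchone()[0])  # -> 'wal'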
Model Selection Guide
Recommended Models by Use Case
Language Learning
| Model | Size | Best For |
|---|---|---|
| llama3:8b-instruct-q4_K_M | 4.9GB | European languages, grammar accuracy |
| qwen2:7b | 4.4GB | Multi-lingual, Asian languages |
| glm4 | 9.4GB | Best multi-lingual support |
General Chat
| Model | Size | Best For |
|---|---|---|
| llama3:8b-instruct-q4_K_M | 4.9GB | General knowledge, coding |
| gemma2:9b | 5.4GB | Technical discussions |
| qwen2:7b | 4.4GB | Math, science |
Legal Chat
| Model | Size | Best For |
|---|---|---|
| adrienbrault/saul-instruct-v1:Q4_K_M | 4.1GB | REQUIRED - English law specialist |
Medical Chat
| Model | Size | Best For |
|---|---|---|
| meditron-7b | 4.1GB | Maritime medical guidance |
| biomistral-7b | 4.1GB | Biomedical terminology |
Model Installation
# Pull specific model
ollama pull llama3:8b-instruct-q4_K_M
# List installed models
ollama list
# Remove model
ollama rm model-name
Model Performance
Approximate inference speed on typical hardware (16GB RAM, CPU only):
- 4GB models: ~20-30 tokens/sec
- 7GB models: ~15-25 tokens/sec
- 9GB models: ~10-20 tokens/sec
Customization & Configuration
System Prompt Customization
Navigate to Settings page (/settings) to customize system prompts for each mode:
Features
- Live Editor: Monaco-style textarea for editing
- Browser Storage: Changes saved to localStorage
- Reset to Defaults: Restore original prompts
- Preview: See formatted prompt before saving
- Instant Effect: Changes apply on next message
Customization Tips
- Adjust tone (formal/casual)
- Add domain-specific knowledge
- Change output format preferences
- Include LaTeX notation guidance
Environment Variables
Configurable in backend/.env:
# LLM Provider
LLM_PROVIDER=local # local | openai
# Ollama Configuration
OLLAMA_MODEL=llama3:8b-instruct-q4_K_M
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_TIMEOUT=300 # seconds
# OpenAI Fallback (optional)
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o-mini
# CORS
ALLOWED_ORIGINS=http://localhost:3000,http://localhost:3001
# Database
DATABASE_PATH=./data/katja.db
# Logging
LOG_LEVEL=INFO # DEBUG | INFO | WARNING | ERROR
Frontend Configuration
Edit frontend/config/systemPrompts.ts for default prompts:
export const DEFAULT_PROMPTS = {
languageLearning: "You are a friendly language teacher...",
general: "You are a helpful AI assistant...",
legal: "You are a legal expert specializing in English common law...",
medical: "You are a maritime medical advisor..."
};
Port Configuration
# Frontend (package.json)
"scripts": {
"dev": "next dev -p 3000"
}
# Backend (uvicorn command)
uvicorn main:app --host 0.0.0.0 --port 8000 --reload