99 - ai-proxy.cm.itcollege.ee
School provides some limited LLM api access for students.
There is proxy server that authenticates students and forwards requests to LLM api. Proxy is designed to do minimal changes to original api.
Proxy is located at: https://ai-proxy.cm.itcollege.ee/
Main functionality is:
- manage student accounts
- monitor usage
- cost and credit control
- handle students personal api keys
- llm api configuration
- removes student api key and replaces with school api key
- proxies requests as-is, including SSE support
- can do some request/response transformations - query, headers, etc.
Currently main LLM provider is Azure, using Central Sweden region (EU, GDPR, etc).
Configuration
Mainly 2 options are used - OpenAI chat completions or OpenAi responses (newer).
gpt-5.2-codex is using responses, other models are mostly using chat completions.
Azure does not have models endpoint, so we need to hardcode model names (names are case sensitive and sometimes different from original model names).
If configuration does not allow to custom model name, proxy can be configured to change model name during every request/response.
In theory there is also Anthropic models (Opus) with their own messages api - ask for access if you need it (expensive).
Example costs are experimental, analyzing ai-proxy codebase (ask mode). Prompt used was:
Analyze this codebase thoroughly. Produce a structured report covering:
## 1. Purpose & Domain
- What does this application do? Core business domain.
- Target users/consumers (API clients, end users, both?)
- Key workflows and use cases
## 2. Architecture
- Overall pattern (Clean/Onion, N-tier, Vertical Slices, CQRS, etc.)
- Project/solution structure - list all projects and their responsibilities
- Dependency graph between projects
- DI registration patterns and service lifetimes
- Middleware pipeline order
- Any architectural violations or circular dependencies
## 3. Data Layer & ERD
- ORM used (EF Core version, Dapper, etc.)
- List all entities and their relationships
- Generate a Mermaid ERD diagram
- Migration strategy (code-first, DB-first?)
- Identify: soft deletes, audit fields, tenant isolation, concurrency tokens
- Raw SQL or stored procedures usage
- Connection/DbContext management (single, multi-context, pooling)
## 4. API Surface
- List all controllers/endpoints with HTTP methods and routes
- Authentication/authorization scheme (JWT, cookies, Identity, external providers)
- API versioning strategy
- Request validation approach (FluentValidation, DataAnnotations, manual)
- Response patterns (envelope/wrapper, ProblemDetails, raw)
- Rate limiting, CORS config
## 5. Test Coverage
- Test projects and frameworks used (xUnit, NUnit, MSTest)
- Approximate coverage: unit, integration, E2E
- What's tested well vs. what's NOT tested (be specific)
- Test data strategy (fixtures, builders, AutoFixture, Bogus)
- Integration test infrastructure (WebApplicationFactory, Testcontainers, in-memory DB)
## 6. Coding Style & Patterns
- C# version features used (nullable refs, primary constructors, records, etc.)
- Naming conventions (consistent? violations?)
- Error handling strategy (exceptions, Result pattern, both?)
- Logging approach and structured logging usage
- Mapping strategy (AutoMapper, Mapster, manual)
- Async/await correctness (ConfigureAwait, fire-and-forget, deadlock risks)
## 7. Configuration & Deployment
- Configuration sources (appsettings, env vars, user secrets, key vault)
- Environment-specific configs
- Docker support (Dockerfile quality, compose setup)
- Health checks
- HTTPS/TLS setup
## 8. Production Readiness Assessment
Rate each 1-5 with justification:
- Security (auth, input validation, secrets management, OWASP top 10)
- Performance (N+1 queries, missing indexes hints, caching, pagination)
- Observability (logging, metrics, tracing, correlation IDs)
- Error handling (global handler, graceful degradation)
- Scalability (stateless?, session affinity, background jobs)
- Maintainability (code duplication, dead code, TODOs, tech debt)
## 9. Red Flags & Debt
- List specific code smells, anti-patterns, security vulnerabilities
- Hardcoded values, magic strings/numbers
- Missing null checks in nullable context
- Synchronous over async or vice versa
- Any God classes or 500+ line methods
## 10. Summary
- One-paragraph executive summary
- Top 5 things to fix immediately
- Top 5 strengths
Kilo Code configurations (tested on 2026 feb)
NB! Model names are case sensitive.
grok-code-fast-1
Config params
- Api Provider: OpenAI Compatible
- Base URL: https://ai-proxy.cm.itcollege.ee/azure-models
- Api Key: Your personal key from ai-proxy
- Model: grok-code-fast-1 (use custom model name)
- Enable Streaming: true
- Enable reasoning effort: medium, high, extra high (possible, but costly)
- Context windows size: 256k
- Image support: false
- Input price: 0.17 mtok
- Output price: 1.26 mtok
Demo task cost (no reasoning effort set): 0.14. Time: 1 minutes. 20 requests.
Kimi-K2.5
Config params
- Api Provider: OpenAI Compatible
- Base URL: https://ai-proxy.cm.itcollege.ee/azure-models
- Api Key: Your personal key from ai-proxy
- Model: Kimi-K2.5 (use custom model name)
- Enable Streaming: true
- Enable reasoning effort: medium, high, extra high (possible, but costly)
- Context windows size: 256k
- Image support: true
- Input price: 0.45 mtok
- Output price: 2.25 mtok
Demo task cost (no reasoning effort set): 0.60. Time: 12 minutes. 30 requests
gpt-5.2-codex
Config params
- Api Provider: OpenAI Compatible (Responses)
- Base URL: https://ai-proxy.cm.itcollege.ee/azure-openai
- Api Key: Your personal key from ai-proxy
- Model: gpt-5.2-codex (use custom model name)
- Enable Streaming: true
- Enable reasoning effort: medium, high, extra high (possible, but costly)
- Context windows size: 400k
- Image support: true
- Input price: 1.75 mtok
- Output price: 14.00 mtok
Demo task cost (no reasoning effort set): 1.28. Time: 3 minutes. 18 requests.
Opus 4.6
- Context windows size: 200k, 1m possible
- Input price: 5.00 mtok
- Output price: 25.00 mtok
Demo task cost (no reasoning effort set): 1.33. Time: 2 minutes. 23 requests.