tosijs-schema: A Fast, Schema-First TypeScript Library for Efficient Data Validation and LLM Integration

tosijs-schema is a lightweight, schema-first TypeScript and JavaScript library that uses JSON Schema as a single, standards-compliant source of truth for data types. Developers define a data structure once using a concise, property-based syntax, and the library infers the corresponding TypeScript types. It emphasizes performance, correctness, and simplicity, which makes it a good fit for large-scale applications and AI-driven systems.

At its core, tosijs-schema exposes a functional, chainable API. Defining a user schema involves short, readable expressions such as s.string.uuid, s.email, s.enum, and s.integer.min(0).max(10). These properties are implemented as getters, which reduces boilerplate and keeps definitions concise. Metadata such as title, description, and default values can be attached to any schema node without affecting validation, which is useful for generating rich API documentation compatible with Swagger or OpenAPI.

The library infers TypeScript types through its Infer utility, giving type safety at compile time. Applied to a schema, Infer produces a precise type definition that matches it, including optional fields and constrained values.

Performance is a key strength. tosijs-schema uses a "prime-jump" sampling strategy when validating large arrays and dictionaries. Instead of checking every item, it samples a small subset of elements (reported as around 1%), which the project describes as O(1) performance regardless of collection size. The result is near-instant validation on very large datasets with high statistical confidence of detecting errors. Where every element must be checked, as in financial systems, full scanning can be enabled.

Validation is fast by default and returns a boolean. For debugging, detailed error messages can be captured with an onError callback, or validation can throw immediately. The validator is fail-fast: it stops at the first error to minimize CPU usage.
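The sampling idea described above can be illustrated with a small, library-independent sketch. This is not tosijs-schema's implementation: the function names, the list of primes, and the fixed budget of 100 samples are all assumptions chosen for illustration. It shows how stepping through an array by a prime stride spreads a bounded number of checks across the whole collection, and how stopping at the first failure keeps the check fail-fast.

```typescript
// Hypothetical sketch of prime-stride sampling; not part of tosijs-schema.
const CANDIDATE_PRIMES = [7919, 104729, 1299709];

// A prime that does not divide `length` is coprime to it, so stepping by it
// modulo `length` yields distinct indices until the full cycle closes.
function pickStride(length: number): number {
  for (const p of CANDIDATE_PRIMES) {
    if (length % p !== 0) return p;
  }
  return 1; // fallback: plain sequential stepping
}

function sampleValidate<T>(
  items: readonly T[],
  isValid: (item: T) => boolean,
  maxSamples = 100 // assumed fixed budget, so cost does not grow with size
): boolean {
  const n = items.length;

  // Small collections are cheap enough to scan in full.
  if (n <= maxSamples) return items.every((item) => isValid(item));

  const stride = pickStride(n);
  let index = 0;
  for (let i = 0; i < maxSamples; i++) {
    // Fail fast: stop at the first invalid element.
    if (!isValid(items[index] as T)) return false;
    index = (index + stride) % n;
  }
  return true;
}
```

Because only a sample is inspected, a single malformed element in a large array can go undetected, which is the trade-off the full-scan option exists to remove.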
The library supports minProperties and maxProperties on objects and records, though it treats maxProperties with caution. Counting all keys in a large object is an O(N) operation that malicious payloads could exploit, so tosijs-schema assumes that such limits are enforced at higher levels, for example in business logic or the database. This design choice keeps the validator efficient and secure.

A major advantage is compatibility with LLMs. Because tosijs-schema is natively JSON Schema, its output works directly with OpenAI’s response_format: { type: "json_schema" } and Anthropic’s tool use. Unlike Zod, which requires a third-party adapter and often produces verbose, deeply nested schema structures, tosijs-schema outputs clean, flat, minimal JSON Schema, which reduces token usage in context windows and avoids unnecessary complexity.

In benchmarks, tosijs-schema outperforms Zod by up to 1,124x in optimized, long-running scenarios, especially on large arrays and dictionaries. Even in cold-start conditions it is significantly faster, and the performance gap widens after JIT compilation.

tosijs-schema is MIT-licensed, has zero dependencies, and is built for speed, size, and clarity. It is a functional, schema-first alternative to class-based, TypeScript-first libraries like Zod, and is well suited to high-performance systems, serverless environments, and AI agent development.
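To make the LLM integration point concrete, the sketch below shows where a plain JSON Schema object fits into an OpenAI structured-output request body and an Anthropic tool definition. The schema is written by hand as a stand-in for what a schema-first library would emit. It does not call tosijs-schema, and the field names, enum values, model name, and tool name are illustrative assumptions. No network request is made.

```typescript
// Hand-written JSON Schema standing in for generated output (assumed shape).
const userSchema = {
  type: "object",
  properties: {
    id: { type: "string", format: "uuid" },
    email: { type: "string", format: "email" },
    role: { type: "string", enum: ["admin", "editor", "viewer"] }, // hypothetical values
    score: { type: "integer", minimum: 0, maximum: 10 },
  },
  required: ["id", "email", "role", "score"],
  additionalProperties: false,
} as const;

// OpenAI chat completions body using response_format with type "json_schema".
const openAiRequestBody = {
  model: "gpt-4o-mini", // placeholder model name
  messages: [{ role: "user", content: "Generate a sample user record." }],
  response_format: {
    type: "json_schema",
    json_schema: { name: "user", schema: userSchema },
  },
};

// Anthropic tool definition: the same schema is passed as input_schema.
const anthropicTool = {
  name: "record_user", // hypothetical tool name
  description: "Record a user that matches the schema.",
  input_schema: userSchema,
};

// The schema is plain data, so it serializes without an adapter step.
const serializedRequest = JSON.stringify(openAiRequestBody);
```

Since the schema is sent with each request, a flatter schema means fewer tokens spent on it in the context window.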
