By Armen Avetisyan, Senior Software Engineer at OneTick
onetick-py is a high-performance Python library for tick data processing, built on OneTick’s powerful analytics engine. It features a Pandas-like API, enabling operations on tick data using native Python expressions, built-ins, and vectorized computations. Under the hood, it translates Python operations into the OneTick query language, executing on a specialized tick server optimized for low-latency, high-throughput analytics. With access to data from 200+ exchanges via our on-demand market data service, it supports advanced time-series analysis, aggregation, filtering, and event-driven computations at scale.
onetick-py is used internally at OneMarketData and is also distributed to select clients alongside our products. Because relatively few developers write onetick-py code, they rely primarily on documentation and direct communication via Slack or other channels. This creates a bottleneck when errors occur: debugging often means extra time spent navigating documentation or waiting for a response.
While its interface resembles Pandas, effectively writing onetick-py queries requires a deep understanding of tick data and the overall OneTick architecture to ensure well-structured and optimized code. The complexities of analyzing time-series data across multiple symbols further add to the learning curve, making query writing challenging for developers new to onetick-py.
Unlike general-purpose programming languages, domain-specific languages (DSLs) like onetick-py have specialized syntax, semantics, and execution models tailored to specific use cases. However, this specialization also introduces challenges when leveraging large language models (LLMs) for code generation.
Major LLM vendors, such as OpenAI, do not include onetick-py in their training datasets. As a result, out-of-the-box models lack an understanding of its unique constructs and execution behavior. Consequently, attempts to generate onetick-py code using generic LLMs often produce incorrect or inefficient queries that fail to fully utilize OneTick’s capabilities.
To address this limitation, DSLs like onetick-py must develop their own knowledge bases. This involves integrating domain-specific documentation, examples, and best practices to ensure that generated code is accurate, optimized, and aligned with OneTick’s query language.
To streamline development and support both internal teams and clients working with onetick-py, an AI-powered Coding Assistant was introduced. The Coding Assistant provides a web-based interface where users can ask coding questions and receive onetick-py specific code snippets. By leveraging generative AI, it helps developers write, debug, and optimize queries more efficiently, reducing reliance on manual documentation and peer assistance.
The Coding Assistant is particularly valuable for onboarding new employees responsible for writing onetick-py code, accelerating their learning process. Additionally, it significantly enhances the experience for clients using onetick-py, making it easier for them to develop well-structured queries without deep prior expertise.
The Coding Assistant is built on LangChain / LangGraph, a framework that integrates with a wide range of language models. This architecture provides flexibility in selecting and tuning models, allowing the assistant to adapt to different use cases. LangGraph excels at constructing workflows that can revisit and reuse specific components, enhancing efficiency and adaptability.
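The value of a cyclic workflow is easiest to see in miniature. The sketch below illustrates the revisit-and-reuse idea framework-free, with plain Python functions standing in for graph nodes; the node names, the retry limit, and the toy "correctness" check are all illustrative assumptions, not the assistant's actual LangGraph implementation.

```python
# Framework-free sketch of a generate -> reflect -> retry cycle, the kind
# of cyclic workflow LangGraph is designed to express. The two node
# functions are stand-ins for real LLM-generation and validation nodes.

def generate(state):
    # Stand-in for an LLM call that drafts an onetick-py snippet.
    state["code"] = f"query_v{state['attempts']}"
    state["attempts"] += 1
    return state

def reflect(state):
    # Stand-in for a validation step (e.g. running the snippet) that
    # decides whether to loop back to generation or finish.
    state["ok"] = state["code"] == "query_v2"  # pretend the 3rd draft is correct
    return state

def run_workflow(max_attempts=5):
    state = {"attempts": 0, "code": None, "ok": False}
    # The loop revisits the generation node until reflection passes
    # or the attempt budget is exhausted.
    while state["attempts"] < max_attempts and not state["ok"]:
        state = reflect(generate(state))
    return state

result = run_workflow()
print(result["code"], result["ok"])  # → query_v2 True
```

A graph framework adds persistence, branching, and tracing on top of this loop, but the control flow is the same: edges that point backward let later nodes reuse earlier ones.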
Langfuse is utilized for tracing queries and their results, ensuring transparency and insight into the assistant’s operations.
Leveraging generative AI models from OpenAI, the assistant translates natural language prompts into functional onetick-py scripts. This empowers developers to generate code efficiently, reducing manual effort and accelerating the development process.
The overall system architecture is represented in the following diagram:
The Index Builder retrieves and updates documentation from multiple sources, including:
It uses an embedding model to generate embeddings of the documentation, storing them in a pgvector vector database. This component re-gathers and re-indexes the documentation on a weekly basis.
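The embed-and-retrieve step of the Index Builder can be sketched as follows. To keep the example self-contained, a toy hash-based embedder stands in for the real embedding model, an in-memory list stands in for the pgvector table, and the sample chunks are invented; only the overall shape (chunk, embed, store, retrieve by cosine similarity) reflects the architecture described above.

```python
import hashlib
import math

def embed(text, dim=16):
    # Toy deterministic bag-of-words embedder: a stand-in for the real
    # embedding model used by the Index Builder.
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are unit-normalized, so the dot product is cosine similarity.
    return sum(x * y for x, y in zip(a, b))

# In-memory stand-in for the pgvector table; in production each row
# would instead be inserted with SQL along the lines of
#   INSERT INTO doc_chunks (content, embedding) VALUES (%s, %s)
index = [(chunk, embed(chunk)) for chunk in [
    "aggregating trades per symbol",
    "filtering ticks by timestamp",
    "joining quote and trade streams",
]]

def retrieve(query, k=1):
    # Return the k chunks most similar to the query embedding.
    q = embed(query)
    scored = sorted(index, key=lambda row: cosine(q, row[1]), reverse=True)
    return [chunk for chunk, _ in scored[:k]]

print(retrieve("filter ticks by a timestamp range"))
```

At query time the Coding Assistant performs the same retrieval against the real pgvector index, and the returned chunks become the documentation context for the generation prompt.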
The Coding Assistant’s workflow consists of several key steps:
We are currently experimenting with integrating our Coding Assistant into Jupyter AI to enable seamless functionality within the Jupyter Lab environment.
Evaluation tests are conducted to assess the performance of models and prompts, ensuring the reliability and accuracy of code generation. This process helps determine how changes in documentation, prompts, or models impact the overall functionality of the Coding Assistant. By systematically evaluating these factors, we can transition between models while measuring their effect on code generation quality.
We run evaluation tests by comparing the generated code output against the expected result. A tracking mechanism in Langfuse logs each outcome as True (passed) or False (failed); when a test fails, the system also records the error message for further analysis.
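A minimal version of that evaluation loop looks like this. A plain list stands in for Langfuse tracking, and the two sample "generated" snippets are invented; the real harness would run onetick-py code and report outcomes through the Langfuse SDK.

```python
# Sketch of the evaluation loop: execute a generated snippet, compare its
# output to the expected result, and record True/False plus any error.

results = []  # stand-in for Langfuse score/trace logging

def evaluate(name, generated_code, expected):
    record = {"test": name, "passed": False, "error": None}
    try:
        namespace = {}
        exec(generated_code, namespace)          # run the generated snippet
        record["passed"] = namespace.get("result") == expected
        if not record["passed"]:
            record["error"] = f"expected {expected!r}, got {namespace.get('result')!r}"
    except Exception as exc:                     # snippet failed to run at all
        record["error"] = str(exc)
    results.append(record)
    return record

# One snippet that passes, one with a syntax error that fails.
evaluate("mean_ok", "result = sum([10, 20, 30]) / 3", 20.0)
evaluate("bad_syntax", "result = sum([10, 20, 30) / 3", 20.0)

for r in results:
    print(r["test"], r["passed"], r["error"])
```

Recording the error message alongside the boolean outcome is what makes the later prompt-refinement step possible: failure patterns can be grouped and fed back into prompt or documentation changes.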
Based on test results, prompts can be refined and the evaluation test suite can be re-run to continuously improve the Coding Assistant’s performance.
The onetick-py Coding Assistant represents a significant advancement in simplifying and accelerating the development of onetick-py queries. By leveraging generative AI models, LangGraph for workflow optimization, and Langfuse for tracing, the assistant provides an intuitive interface for developers to generate code efficiently.
With its ability to retrieve relevant documentation, apply few-shot learning, and iterate on code corrections using a reflection mechanism, the assistant significantly reduces the learning curve for new developers and enhances productivity for experienced users. Its planned integration with Jupyter AI will further extend its usability, making onetick-py development more seamless and accessible.
Through continuous evaluation and refinement of models and prompts, the Coding Assistant is designed to evolve, ensuring that it remains a reliable tool for both internal teams and clients. As we further enhance its capabilities, it will continue to bridge the gap between natural language queries and optimized onetick-py code, streamlining the development process and improving overall efficiency.
Want to try it out for yourself? Set up a meeting with the OneTick team today to guide you through the process.
— Armen Avetisyan
Senior Software Engineer at OneTick