Zikun’s research

Resume

1 Synthetic Data Augmentation for Database Entity Recognition

We study Database Entity Recognition (DB-ER): identifying database tables, columns, and values in natural-language queries (NLQs), a core but under-isolated subtask of text-to-SQL systems. We reformulate DB-ER as a closed-domain named entity recognition (NER) problem grounded in database schemas and SQL structure. We construct a benchmark derived from Spider and BIRD, introduce an automatic SQL-guided data augmentation method to scale up training data, and propose a T5-based two-stage tagging model. Experiments show that synthetic supervision and encoder fine-tuning substantially improve precision and recall, and that the proposed approach outperforms strong NER baselines such as LUKE and Flair on DB-specific entity types.
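To make the task format concrete, here is a minimal sketch of DB-ER framed as token tagging. It is not the method from the paper: it assumes the table, column, and value mentions are already known as surface phrases (for example, taken from the gold SQL of a text-to-SQL example) and projects them onto the query with exact matching. The function names `tokenize` and `tag_query`, the BIO label names, and the example query are hypothetical and chosen only for illustration.

```python
import re


def tokenize(text):
    # Lowercased alphanumeric tokens; punctuation is dropped.
    return re.findall(r"[A-Za-z0-9_]+", text.lower())


def tag_query(nlq, entities):
    """Assign BIO tags to NLQ tokens by exact phrase matching.

    entities maps a label ("TABLE", "COLUMN", "VALUE") to a list of
    surface phrases assumed to refer to that kind of database entity.
    """
    tokens = tokenize(nlq)
    tags = ["O"] * len(tokens)

    # Try longer phrases first so multi-word mentions win over sub-phrases.
    candidates = sorted(
        ((tokenize(phrase), label)
         for label, phrases in entities.items()
         for phrase in phrases),
        key=lambda item: -len(item[0]),
    )

    for phrase, label in candidates:
        n = len(phrase)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            window_is_free = all(t == "O" for t in tags[i:i + n])
            if tokens[i:i + n] == phrase and window_is_free:
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label

    return list(zip(tokens, tags))


if __name__ == "__main__":
    example = tag_query(
        "List the names of singers from France",
        {"TABLE": ["singers"], "COLUMN": ["names"], "VALUE": ["France"]},
    )
    for token, tag in example:
        print(f"{token}\t{tag}")
```

Exact matching is a deliberate simplification: real queries often paraphrase schema names, which is one reason a learned tagger is needed rather than a lookup.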

Key Contributions:

  • DB-ER Benchmark from Text-to-SQL Data
  • SQL-Guided Data Augmentation via Synthetic Annotation
  • Specialized T5-Based DB-ER Model
  • Evaluation and Ablation Analysis Against Strong Baselines
Note

This work was under the supervision of Professor Ken Pu and Professor Kourosh Davoudi.

Publication at IRI 2025

Data Annotation UI

Presenting at IRI 2025

2 GREx: An Educational Survey Dashboard (https://grex.eilab.ca/)

I developed GREx, an educational survey dashboard: a web application that lets users create and take surveys. I built both the front end and the back end, designed the database schema, and deployed the application.

GREx Index

GREx Visualizer 1

GREx Visualizer 2

Note

This work was under the supervision of Professor Roland van Oostveen (https://eilab.ca/).