
- AR-NLU: A Framework for Enhancing Natural Language Understanding Model Robustness against ASR Errors, June 2024. A major challenge with pipeline spoken language understanding systems is that errors in the upstream automatic speech recognition (ASR) engine adversely impact downstream natural language understanding (NLU) models. To address this challenge, we propose an ASR-Robust NLU (AR-NLU) framework that extends a pre-existing NLU model by training it simultaneously on two input streams: human-generated (gold) transcripts and noisy ASR transcripts.
- Personas within Parameters: Fine-Tuning Small Language Models with Low-Rank Adapters to Mimic User Behaviors, December 2024. Utilized dataset distillation and low-rank fine-tuning to enhance Small Language Models (SLMs) for simulating user agents in recommender systems. Our experiments provide compelling empirical evidence of the efficacy of our methods, demonstrating that user agents developed using our approach have the potential to bridge the gap between offline metrics and real-world performance of recommender systems.
- Enhancing Contract Negotiations with LLM-Based Legal Document Comparison, October 2024. Presents the first approach in the literature to produce a natural-language comparison between legal contracts and their template documents.
- Systematic Evaluation of Long-Context LLMs on Financial Concepts, October 2024. Created a real-world financial news dataset and used it to evaluate the state-of-the-art GPT-4 suite of long-context (LC) LLMs on a series of progressively challenging tasks, measuring performance as a function of context length, task difficulty, and position of key information.
- When and how to paraphrase for named entity recognition?, May 2023. Used simple strategies to annotate entity spans in generated paraphrases and compared established and novel paraphrasing methods, including back translation, specialized encoder-decoder models such as Pegasus, and GPT-3 variants, for their effectiveness in improving downstream NER performance across different levels of gold annotations and paraphrasing strength on 5 datasets.
- EEGNN: Edge Enhanced Graph Neural Network with a Bayesian Nonparametric Graph Model, August 2022. Training deep graph neural networks (GNNs) is challenging, as the performance of GNNs may degrade as the number of hidden message-passing layers grows. The literature has focused on over-smoothing and under-reaching to explain the performance deterioration of deep GNNs. In this paper, we propose a new explanation for this deterioration, mis-simplification, that is, mistakenly simplifying graphs by preventing self-loops and forcing edges to be unweighted.
- Towards Data Efficient And Robust Speech Representation Model Distillation, November 2022. Our research aims to further improve the efficiency of task-agnostic speech representation model pre-training. By perturbing the training data distribution, we distil a more robust task-agnostic speech representation model with a lower training data requirement.
- Knowledge Graphs: Introduction, History and Perspectives, March 2022. The goals of this article are to (a) introduce Knowledge Graphs (KGs) and discuss important areas of application that have gained recent prominence; (b) situate KGs in the context of the prior work in AI; and (c) present a few contrasting perspectives that help in better understanding KGs in relation to related technologies.
- Model-based Reinforcement Learning for Predictions and Control for Limit Order Books, October 2019. We build a profitable electronic trading agent with Reinforcement Learning that places buy and sell orders in the stock market. We demonstrate that the trading policy trained entirely within the environment model can be transferred back into the real market and maintain its profitability.
- Towards Inverse Reinforcement Learning for Limit Order Book Dynamics, June 2019. This paper investigates whether inverse reinforcement learning (IRL) can infer the reward functions of agents within real financial stochastic environments: limit order books (LOB).

Published Patents
- System and Method for Implementing a Client Sentiment Analysis Tool, IDF-01374-US02
- System and Method for Implementing an Intelligent Customer Service Query Management and Routing System, IDF-01458-US02
- System and Method for Generating and Implementing Context Weighted Words, IDF-01581-US02
- Systems and Methods for Contingency NAV Pricing, IDF-01621-HK01
- Systems and Methods for Contingency Net Asset Value Pricing, IDF-01621-EP01
- Field Management Continuous Learning System and Method, IDF-02441-US02
- Systems and Methods for Auto Discovery of Sensitive Data in Application or Databases Using Metadata Via Machine Learning Techniques, IDF-02511A-US02
- System and Method for Counteracting Data-Skewness for Locality Sensitive Hashing Via Feature Selection and Pruning, IDF-2020-0144-US01
- System and Method for Ultra-High Dimensional Hawkes Processes, IDF-2020-0194-US01
- System and Method for Ultra-High Dimensional Hawkes Processes, IDF-2020-0194-WO01
- Graph-To-Signal Domain Based Data Interconnection Classification System and Method, IDF-2020-0256-US01
- System and Method for End-To-End Neural Entity Linking, IDF-2020-0335-US02
- System and Method for Automated Code Analysis and Tagging, IDF-2021-0002-CN01
- System and Method for Federated Secure Vocabulary Learning, IDF-2021-0164-US01
- System and Method for Scalable Biometric Authentication, IDF-2021-0238-US03
- Systems and Methods for Noise Agnostic Federated Learning, IDF-2021-0261-US01
- Systems and Methods for Automated Data Quality Semantic Constraint Identification Using Rich Data Type Inferences, IDF-2021-0371-US02