Learning More From Less Data With Active Learning

How JPMC is combining the power of machine learning and human intelligence to create high-performance models in less time and at less cost.

For companies, a key barrier to adopting machine learning is not a lack of data but a lack of labeled data. Labeling data gets expensive, and the difficulties of sharing and managing large datasets for model development make it a struggle to get machine learning projects off the ground.

That’s where our “learn more from less data” approach comes into action. At JPMorgan Chase, we are focused on reducing the amount of data needed to build models. We do this by building gold training datasets, which helps reduce labeling cost and increase the agility of model development.

Labeled data is a group of samples that have been tagged with one or more labels. Once a labeled dataset is available, a machine learning model can be trained on it so that, when new, unlabeled data is presented to the model, it can predict a likely label for that data. A gold training dataset is a small, labeled dataset with high predictive power.

So Where Does Active Learning Come In?

Active learning is a form of semi-supervised learning that works well when you have a lot of data but face the expense of getting that data labeled. By identifying the samples that are most informative, teams can label only the data points that improve the quality of the model.

Using machine learning (ML) models, active learning can help identify difficult data points and ask a human annotator to focus on labeling them.

To explain passive learning and active learning, let’s use the analogy of teacher and student. In the passive learning approach, a student learns by listening to the teacher's lecture. In active learning, the teacher describes concepts, students ask questions, and the teacher spends more time explaining the concepts that are difficult for a student to understand. Student and teacher interact and collaborate in the learning process.

In ML model development using active learning, the annotator and the modeler interact and collaborate. The annotator provides a small labeled dataset. The modeling team builds a model and generates input on what to label next. Within a few iterations, teams can build refined requirements, a labeled gold training set, an active learner and a working machine learning model.
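The iteration described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used at JPMorgan Chase: `CentroidClassifier` is a toy stand-in for a real model, `oracle` is a hypothetical function that plays the role of the human annotator, and the "what to label next" step simply picks the points whose top predicted probability is lowest.

```python
import math


class CentroidClassifier:
    """Toy 1-D model (an assumption for this sketch): one centroid per class."""

    def fit(self, xs, ys):
        sums = {}
        for x, y in zip(xs, ys):
            total, count = sums.get(y, (0.0, 0))
            sums[y] = (total + x, count + 1)
        self.centroids = {y: total / count for y, (total, count) in sums.items()}
        return self

    def predict_proba(self, x):
        # Softmax over negative distance to each class centroid.
        scores = {y: math.exp(-abs(x - c)) for y, c in self.centroids.items()}
        norm = sum(scores.values())
        return {y: s / norm for y, s in scores.items()}

    def predict(self, x):
        probs = self.predict_proba(x)
        return max(probs, key=probs.get)


def active_learning_loop(pool, oracle, seed_indices, rounds, batch_size):
    """Train, ask the annotator about the least confident points, and repeat."""
    # Step 1: the annotator provides a small labeled seed set.
    labeled = {i: oracle(pool[i]) for i in seed_indices}
    model = CentroidClassifier()
    for _ in range(rounds):
        # Step 2: the modeling team builds a model on what is labeled so far.
        model.fit([pool[i] for i in labeled], list(labeled.values()))
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        # Step 3: rank unlabeled points, least confident prediction first.
        unlabeled.sort(key=lambda i: max(model.predict_proba(pool[i]).values()))
        # Step 4: the annotator labels only the selected batch.
        for i in unlabeled[:batch_size]:
            labeled[i] = oracle(pool[i])
    # Final fit on the full set of labels collected (the "gold" set here).
    model.fit([pool[i] for i in labeled], list(labeled.values()))
    return model, labeled


if __name__ == "__main__":
    pool = [i * 0.1 for i in range(101)]          # unlabeled data: 0.0 .. 10.0
    oracle = lambda x: 1 if x > 5.0 else 0        # simulated human annotator
    model, labeled = active_learning_loop(pool, oracle, [0, 100], rounds=3, batch_size=3)
    print(sorted(round(pool[i], 1) for i in labeled))
```

In this toy run the queried points cluster around the class boundary rather than being spread evenly over the pool, which is the behavior the teacher and student analogy describes: effort goes to the concepts that are hardest to learn.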

How We Identify Difficult Data Points

To identify difficult data points, we use a combination of methods, including:

  • Classification uncertainty sampling: When querying for labels, the strategy selects the sample with the highest uncertainty — data points the model knows least about. Labeling these data points makes the ML model more knowledgeable.

  • Margin uncertainty: When querying for labels, the strategy selects the sample with the smallest margin, that is, the smallest gap between the model’s two most likely classes. These are data points the model knows something about but isn’t confident enough to classify well. Labeling these examples increases model accuracy.

  • Entropy sampling: Entropy is a measure of uncertainty. It is proportional to the average number of guesses one has to make to find the true class. In this approach, we pick the samples with the highest entropy.

  • Disagreement-based sampling: In this method, we pick the samples on which different algorithms disagree. For example, suppose the model classifies into five classes (A, B, C, D and E) and we use five different classifiers:

    1. Bag of words

    2. LSTM

    3. CNN

    4. BERT

    5. HAN (Hierarchical Attention Networks)

    The annotator then labels the examples on which the classifiers disagree.

  • Information density: In this approach, we focus on the denser regions of the data and select a few points in each dense region. Labeling these data points helps the model classify the large number of data points around them.

  • Business value: In this method, we focus on labeling the data points that have higher business value than the others.
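Several of the strategies above reduce to a score computed from a model's predicted class probabilities. The sketch below shows common textbook formulations of classification uncertainty, margin, entropy and committee disagreement. The function names are hypothetical and chosen for this illustration, disagreement is measured here as the entropy of the classifiers' votes (one of several possible measures), and information density and business value are left out because they depend on the data and the use case.

```python
import math
from collections import Counter


def least_confidence(probs):
    """Classification uncertainty: 1 minus the probability of the top class."""
    return 1.0 - max(probs)


def margin(probs):
    """Gap between the two most likely classes; a small gap is a close call."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def entropy(probs):
    """Shannon entropy in bits; higher means the prediction is more spread out."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def vote_entropy(votes):
    """Committee disagreement: entropy of the labels voted by each classifier."""
    counts = Counter(votes)
    return entropy([c / len(votes) for c in counts.values()])


def pick_to_label(pool_scores_input, score, k, highest=True):
    """Return the indices of the k samples to send to the annotator.

    Use highest=True for uncertainty, entropy and disagreement scores,
    and highest=False for margin, where the smallest value is selected.
    """
    order = sorted(
        range(len(pool_scores_input)),
        key=lambda i: score(pool_scores_input[i]),
        reverse=highest,
    )
    return order[:k]


if __name__ == "__main__":
    # Predicted class probabilities for three unlabeled samples (made-up numbers).
    pool_probs = [[0.90, 0.10], [0.55, 0.45], [0.70, 0.30]]
    print(pick_to_label(pool_probs, least_confidence, k=1))
    print(pick_to_label(pool_probs, margin, k=1, highest=False))
    print(pick_to_label(pool_probs, entropy, k=1))

    # Votes from five classifiers on two samples (made-up labels).
    committee_votes = [["A", "A", "A", "A", "A"], ["A", "B", "A", "C", "B"]]
    print(pick_to_label(committee_votes, vote_entropy, k=1))
```

With only two classes the first three scores rank samples identically; they start to differ when there are more classes, because margin looks only at the top two probabilities while entropy takes the whole distribution into account.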

Alignment Between Humans and Machines

Traditionally, data scientists work with annotators to label a portion of their data and hope for the best when training their model. If the model isn’t sufficiently predictive, more data is labeled, and they try again until its performance reaches an acceptable level. While this approach still makes sense for some problems, for those that have vast amounts of data or unstructured data, we find that active learning is a better solution.

Active learning combines the power of machine learning with human annotators to select the next best data points to label. This intelligent selection leads to the creation of high-performance models in less time and at lower cost.

The Artificial Intelligence & Machine Learning group is focused on increasing the volume and velocity of AI applications across the firm by helping develop common platforms, reusable services and solutions.