What is human-in-the-loop machine learning? Better data and more effective models
Human-in-the-loop machine learning uses human feedback to eliminate errors in the training data and increase the accuracy of the model.
Machine learning models are often far from perfect. When using a model’s predictions for purposes that affect people’s lives, such as a credit approval decision, a human should review at least some of the predictions: those with low confidence, those that fall out of range, and a random quality-control sample.
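As a minimal illustration of that triage, the Python sketch below routes predictions to human review based on three rules. The threshold, the valid range, and the sampling rate are hypothetical values, not taken from any particular product.

```python
# A minimal sketch of routing model predictions to human review.
# All three rule parameters below are illustrative assumptions.
import random

CONFIDENCE_THRESHOLD = 0.80   # below this, send to a human
VALID_RANGE = (300, 850)      # e.g., a plausible credit-score range
QC_SAMPLE_RATE = 0.02         # random 2% quality-control sample

def needs_human_review(prediction: float, confidence: float) -> bool:
    """Flag low-confidence, out-of-range, or randomly sampled predictions."""
    if confidence < CONFIDENCE_THRESHOLD:
        return True
    if not (VALID_RANGE[0] <= prediction <= VALID_RANGE[1]):
        return True
    if random.random() < QC_SAMPLE_RATE:
        return True
    return False
```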
Additionally, the lack of good tagged (annotated) data often makes supervised learning difficult (unless you’re a professor with plenty of students who have nothing better to do). One way to implement semi-supervised learning on untagged data is to have people label some data to seed a model, use the interim model’s high-confidence predictions (or those of a transfer-learning model) to tag more data (self-labeling), and send the low-confidence predictions to humans for review (active learning). The process can be repeated, and in practice it tends to improve from pass to pass.
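That seed, self-label, and active-learning loop can be sketched with scikit-learn. The confidence thresholds and the ask_human() placeholder below are illustrative assumptions, not a prescribed recipe.

```python
# A sketch of the seed -> self-label -> active-learning loop, using
# scikit-learn. Thresholds and ask_human() are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

def ask_human(samples):
    """Placeholder: in practice, queue these samples for human annotators."""
    raise NotImplementedError("hook up your annotation workflow here")

def hitl_loop(X_seed, y_seed, X_unlabeled, rounds=3, hi=0.95, lo=0.60):
    X_train, y_train = X_seed, y_seed
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    for _ in range(rounds):
        if len(X_unlabeled) == 0:
            break
        proba = model.predict_proba(X_unlabeled)
        conf = proba.max(axis=1)
        sure = conf >= hi              # self-labeling candidates
        unsure = conf <= lo            # active-learning candidates
        y_auto = model.classes_[proba[sure].argmax(axis=1)]
        y_human = ask_human(X_unlabeled[unsure])
        X_train = np.vstack([X_train, X_unlabeled[sure], X_unlabeled[unsure]])
        y_train = np.concatenate([y_train, y_auto, y_human])
        X_unlabeled = X_unlabeled[~(sure | unsure)]
        model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    return model
```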
In short, human-in-the-loop machine learning relies on human feedback to improve the quality of the data used to train machine learning models. In general, it is all about sampling good data for humans to tag (annotation), using that data to train a model, and using the model to sample more data for annotation. There are many services available to manage this process.
Amazon SageMaker Ground Truth
Amazon SageMaker offers two data labeling services: Amazon SageMaker Ground Truth Plus and Amazon SageMaker Ground Truth. Both options let you identify raw data, such as images, text, and videos, and add informative labels to create high-quality training data sets for machine learning models. With Ground Truth Plus, Amazon experts set up the data labeling workflows for you, and the process applies pre-labeling and machine validation of human labeling.
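For a sense of how a Ground Truth job is launched programmatically, here is a hedged boto3 sketch using the create_labeling_job API. Every ARN, bucket name, and template path is a placeholder to replace with your own resources; see the AWS documentation for the full set of options.

```python
# A hedged sketch of starting a Ground Truth labeling job with boto3.
# All names, ARNs, and S3 paths below are placeholders.
import boto3

sm = boto3.client("sagemaker")

sm.create_labeling_job(
    LabelingJobName="image-classification-demo",        # placeholder
    LabelAttributeName="label",
    InputConfig={
        "DataSource": {
            "S3DataSource": {"ManifestS3Uri": "s3://my-bucket/manifest.json"}
        }
    },
    OutputConfig={"S3OutputPath": "s3://my-bucket/output/"},
    RoleArn="arn:aws:iam::123456789012:role/GroundTruthRole",
    HumanTaskConfig={
        "WorkteamArn": "arn:aws:sagemaker:us-east-1:123456789012:workteam/private-crowd/my-team",
        "UiConfig": {"UiTemplateS3Uri": "s3://my-bucket/template.liquid"},
        "PreHumanTaskLambdaArn": "arn:aws:lambda:us-east-1:123456789012:function:pre-label",
        "TaskTitle": "Image classification",
        "TaskDescription": "Classify each image",
        "NumberOfHumanWorkersPerDataObject": 1,
        "TaskTimeLimitInSeconds": 300,
        "AnnotationConsolidationConfig": {
            "AnnotationConsolidationLambdaArn": "arn:aws:lambda:us-east-1:123456789012:function:consolidate"
        },
    },
)
```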
Amazon Augmented AI
While Amazon SageMaker Ground Truth handles data labeling before deployment, Amazon Augmented AI (Amazon A2I) provides human review of low-confidence predictions or random prediction samples from deployed models. Augmented AI manages both the creation of the review workflows and the human reviewers. It integrates with AWS AI and machine learning services, as well as with models deployed to an Amazon SageMaker endpoint.
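A minimal sketch of triggering an A2I review from application code might look like the following, using the sagemaker-a2i-runtime client’s start_human_loop call. The flow definition ARN, loop name, confidence threshold, and input payload schema are all assumptions; the real schema depends on how your review workflow is defined.

```python
# A hedged sketch of sending a low-confidence prediction to an
# Amazon A2I human review loop with boto3. ARNs are placeholders.
import json
import boto3

a2i = boto3.client("sagemaker-a2i-runtime")

prediction = {"label": "approved", "confidence": 0.42}  # example model output

if prediction["confidence"] < 0.80:                     # assumed threshold
    a2i.start_human_loop(
        HumanLoopName="loan-review-0001",               # placeholder
        FlowDefinitionArn="arn:aws:sagemaker:us-east-1:123456789012:flow-definition/loan-review",
        HumanLoopInput={"InputContent": json.dumps(prediction)},
    )
```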
DataRobot Humble AI
DataRobot has a Humble AI feature that lets you set rules to detect uncertain predictions, outlying inputs, and low-observation regions. These rules can trigger three possible actions: no operation (monitor only); override the prediction (typically with a “safe” value); or return an error (discard the prediction). DataRobot has written documentation about humans in the loop, but I couldn’t find any implementation on its site other than the humility rules.
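Since the article found only documentation for the feature, the sketch below is a generic, hypothetical illustration in plain Python of the three humility actions just described; it is not DataRobot code or its API.

```python
# A generic illustration (not the DataRobot API) of the three
# humility actions: monitor only, override with a "safe" value,
# or raise an error to discard the prediction.
SAFE_VALUE = 0.0  # assumed "safe" fallback value

def apply_humility_rule(prediction, confidence, action="monitor"):
    """Apply one of three actions when a prediction looks uncertain."""
    if confidence >= 0.80:                  # assumed trigger threshold
        return prediction                   # rule not triggered
    if action == "monitor":
        print(f"Uncertain prediction logged: {prediction!r}")
        return prediction                   # no operation, just monitor
    if action == "override":
        return SAFE_VALUE                   # replace with the safe value
    if action == "error":
        raise RuntimeError("prediction discarded by humility rule")
    raise ValueError(f"unknown action: {action}")
```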
Google Cloud Human-in-the-Loop
Google Cloud offers Human-in-the-Loop (HITL) processing built into its Document AI services, but as of this writing, nothing for image or video processing. Currently, Google supports HITL review workflows for the following processors (a hedged usage sketch follows the list):
Procurement processors:
Invoice parser
Receipt parser
Lending processors:
1003 parser
1040 parser
1040 Schedule C parser
1040 Schedule E parser
1099-DIV parser
1099-G parser
1099-INT parser
1099-MISC parser
Bank statement parser
HOA statement parser
Mortgage statement parser
Pay slip parser
Retirement/investment statement parser
W2 parser
W9 parser
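Here is the usage sketch promised above: a hedged example of sending a PDF through a Document AI processor with the google-cloud-documentai client. The project, location, and processor IDs are placeholders, and the HITL review step itself is configured on the processor in the Cloud console rather than in this call.

```python
# A hedged sketch of processing a document with Google Document AI.
# Project, location, and processor IDs below are placeholders.
from google.cloud import documentai

client = documentai.DocumentProcessorServiceClient()

# Fully qualified resource name of an (assumed) invoice processor
name = client.processor_path("my-project", "us", "my-invoice-processor-id")

with open("invoice.pdf", "rb") as f:
    raw_document = documentai.RawDocument(
        content=f.read(), mime_type="application/pdf"
    )

result = client.process_document(
    request=documentai.ProcessRequest(name=name, raw_document=raw_document)
)

# Inspect extracted entities and their confidence scores
for entity in result.document.entities:
    print(entity.type_, entity.mention_text, entity.confidence)
```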
Human-in-the-loop labeling software
Setting up human annotation of images, such as for image classification, object detection, and semantic segmentation, can be difficult when labeling datasets. Fortunately, there are many good commercial and open source tools that labelers can use.
Humans in the Loop, a company that describes itself as a “social enterprise that delivers ethical workforce solutions to power the AI industry,” blogs periodically about its favorite annotation tools. In one of its most recent posts, it listed 10 open source computer vision annotation tools: Label Studio, Diffgram, LabelImg, CVAT, ImageTagger, LabelMe, VIA, Make Sense, COCO Annotator, and DataTurks. These tools are generally used to annotate training sets, and some can also manage the annotation workflow.
For example, Humans in the Loop says of the Computer Vision Annotation Tool (CVAT): “CVAT is powerful, up-to-date, and works in Chrome. It’s still one of the main tools that we and our clients use for labeling, because it’s much faster than many tools on the market.”
The CVAT README on GitHub says, “CVAT is a free, web-based, interactive image and video annotation tool for computer vision. It is used by our team to annotate millions of objects with different properties. Many UI and UX decisions are based on feedback from professional data annotation teams. Try it online at cvat.org.” You need to create a login to run the demo.
CVAT is released as open source under the MIT license. Most of its active contributors work for Intel in Nizhny Novgorod, Russia. The CVAT introductory video shows how the labeling process works.
As you can see, human-in-the-loop processing can contribute to the machine learning process at two points: the initial creation of labeled datasets for supervised learning, and the review and correction of potentially problematic predictions from a deployed model. The first use case helps to bootstrap the model, and the second helps to tune the model.
Source: InfoWorld