A Hands-On Introduction to cuML for GPU-Accelerated Machine Learning Workflows

In this article, you’ll learn what cuML is and how it can significantly speed up the training of machine learning models through GPU acceleration.

Topics we’ll cover include:

The purpose and distinctive features of cuML.
How to prepare datasets and train a machine learning model for classification with cuML in a scikit-learn-like fashion.
How to easily compare results with an equivalent conventional scikit-learn model, in terms of classification accuracy and training time.

Let’s not waste any more time.

Introduction

This article provides a hands-on Python introduction to cuML, a Python library from RAPIDS AI (an open-source suite from NVIDIA) for GPU-accelerated machine learning workflows across widely used models. Together with its data science-oriented sibling, cuDF, cuML has gained popularity among practitioners who need scalable, production-ready machine learning solutions.

The hands-on tutorial below uses cuML together with cuDF for GPU-accelerated dataset management in a DataFrame format. For an introduction to cuDF, check out this related article.

About cuML: An “Accelerated Scikit-Learn”

RAPIDS cuML (short for CUDA Machine Learning) is an open-source library that accelerates scikit-learn-style machine learning on NVIDIA GPUs. It provides drop-in replacements for many popular algorithms, often reducing training and inference times on large datasets, without major code changes or a steep learning curve for those familiar with scikit-learn.

Its three most distinctive features are:

cuML follows a scikit-learn-like API, easing the transition from CPU to GPU for machine learning with minimal code changes
It covers a broad set of techniques, all GPU-accelerated, including regression, classification, ensemble methods, clustering, and dimensionality reduction
Through tight integration with the RAPIDS ecosystem, cuML works hand-in-hand with cuDF for data preprocessing, as well as with related libraries to facilitate end-to-end, GPU-native pipelines

Hands-On Introductory Example

To illustrate the basics of cuML for building GPU-accelerated machine learning models, we’ll consider a fairly large, yet easily accessible, dataset available via public URL in Jason Brownlee’s repository: the adult income dataset. This is a large, slightly class-imbalanced dataset intended for binary classification tasks, specifically predicting whether an adult’s income level is high (above $50K) or low ($50K or below) based on a set of demographic and socio-economic features. Therefore, we aim to build a binary classification model.

IMPORTANT: To run the code below on Google Colab or a similar notebook environment, make sure you change the runtime type to GPU; otherwise, a warning will be raised indicating cuDF can’t find the specific CUDA driver library it uses.
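Before importing cuDF or cuML, it can help to confirm that the notebook actually sees an NVIDIA GPU. The snippet below is a minimal sketch that relies only on the standard library and on the `nvidia-smi` command-line tool that ships with NVIDIA drivers; the helper name `nvidia_gpu_visible` is hypothetical and not part of RAPIDS.

```python
import shutil
import subprocess


def nvidia_gpu_visible(timeout: float = 10.0) -> bool:
    """Return True if `nvidia-smi` is on PATH and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # driver tools not found, so a GPU runtime is unlikely
    try:
        result = subprocess.run(
            [exe, "-L"],  # -L lists the GPUs the driver can see
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU runtime detected:", nvidia_gpu_visible())
```

If this prints `False` on Colab, switching the runtime type to GPU and re-running the cell is the first thing to try.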

We start by importing the necessary libraries for our scenario:

import cudf
import cuml
from cuml.model_selection import train_test_split as gpu_train_test_split
from cuml.linear_model import LogisticRegression as cuLogReg
from IPython.display import display
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
import time

Note that, in addition to the cuML modules and functions to split the dataset and train a logistic regression classifier, we’ve also imported their classical scikit-learn counterparts. While not necessary for using cuML (as it works independently from plain scikit-learn), we’re importing the equivalent scikit-learn components for the sake of comparison in the rest of the example.
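Because the two libraries expose matching class names, code that should run on machines with or without RAPIDS can choose its backend at import time. The following is a sketch under that assumption; `pick_backend` and `load_logistic_regression` are hypothetical helper names, and only the import paths already used in this tutorial are referenced.

```python
import importlib.util


def pick_backend() -> str:
    """Return 'cuml' if cuML is installed, 'sklearn' if only scikit-learn is, else 'none'."""
    for name in ("cuml", "sklearn"):
        # find_spec on a top-level name only locates the package; it does not import it
        if importlib.util.find_spec(name) is not None:
            return name
    return "none"


def load_logistic_regression():
    """Import LogisticRegression from whichever backend is available."""
    backend = pick_backend()
    if backend == "cuml":
        from cuml.linear_model import LogisticRegression
    elif backend == "sklearn":
        from sklearn.linear_model import LogisticRegression
    else:
        raise ImportError("Neither cuML nor scikit-learn is installed")
    return backend, LogisticRegression


print("Selected backend:", pick_backend())
```

Note that this check only detects whether the package is installed. A machine with cuML installed but no usable GPU may still fail when the import actually runs.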

Next, we load the dataset into a cuDF DataFrame optimized for GPU usage:

url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/adult-all.csv"

# Column names (they are not included in the dataset's CSV file we'll read)
cols = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "sex",
    "capital_gain", "capital_loss", "hours_per_week", "native_country", "income"
]

df = cudf.read_csv(url, header=None, names=cols)
display(df.head())

Once the data is loaded, we identify the target variable and convert it into binary (1 for high income, 0 for low income):

df["income"] = df["income"].str.strip()
df["income"] = (df["income"] == ">50K").astype("int32")
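To see why the strip step matters, here is the same logic applied to a handful of made-up labels in plain Python. The toy values are illustrative and not taken from the dataset; the sketch also reports the share of positive labels, which is a quick way to check class balance after encoding.

```python
from collections import Counter

# Toy labels with the kind of stray whitespace that str.strip() removes
raw_labels = [" >50K", "<=50K", " <=50K ", ">50K ", " <=50K"]

# Same logic as the two DataFrame lines: strip, then compare against ">50K"
encoded = [int(label.strip() == ">50K") for label in raw_labels]

# Without stripping, padded labels silently fall into class 0
encoded_no_strip = [int(label == ">50K") for label in raw_labels]

counts = Counter(encoded)
positive_share = counts[1] / len(encoded)

print(encoded)           # [1, 0, 0, 1, 0]
print(encoded_no_strip)  # [0, 0, 0, 0, 0]
print(f"share of class 1: {positive_share:.2f}")  # share of class 1: 0.40
```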

This dataset combines numeric features with a slight predominance of categorical ones. Most scikit-learn models, including decision trees and logistic regression, don’t natively handle string-valued categorical features, so they require encoding. A similar pattern applies to cuML; hence, we’ll select a small number of features to train our classifier and one-hot encode the categorical ones.

# Feature selection (for instance, based on domain expertise!)
features = ["age", "education_num", "hours_per_week", "workclass", "occupation", "sex"]
X = df[features]
y = df["income"]

# One-hot encode categorical features
X_enc = cudf.get_dummies(X, drop_first=True)
print("Encoded feature shape:", X_enc.shape)
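The effect of `drop_first=True` is easiest to see on a tiny example. The function below is a plain-Python sketch of one-hot encoding, not the cuDF implementation; ordering the categories alphabetically is an assumption made for this illustration.

```python
def one_hot(values, drop_first=False):
    """One-hot encode a list of category labels into a dict of 0/1 columns."""
    categories = sorted(set(values))
    if drop_first:
        # The dropped level is still recoverable: it is the all-zeros row
        categories = categories[1:]
    return {cat: [int(v == cat) for v in values] for cat in categories}


sex = ["Male", "Female", "Female", "Male"]
full = one_hot(sex)
reduced = one_hot(sex, drop_first=True)

print(full)     # {'Female': [0, 1, 1, 0], 'Male': [1, 0, 0, 1]}
print(reduced)  # {'Male': [1, 0, 0, 1]}
```

A feature with k categories therefore yields k - 1 columns, which removes a redundant column that is fully determined by the others.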

So far, we’ve used cuML (and also cuDF) much like using classical scikit-learn together with Pandas.

Now comes the interesting part. We will split the dataset into training and test sets and train a logistic regression classifier twice, using both the CUDA GPU (cuML) and standalone scikit-learn. We will then compare both the classification accuracy and the time taken to train each model. Here’s the complete code for the model training and comparison:

# MODEL 1: GPU (cuML) train-test split and training
t0 = time.time()
X_train, X_test, y_train, y_test = gpu_train_test_split(X_enc, y, test_size=0.2, random_state=42)
model_gpu = cuLogReg(max_iter=1000)
model_gpu.fit(X_train, y_train)
gpu_time = time.time() - t0
acc_gpu = model_gpu.score(X_test, y_test)
print(f"cuML Logistic Regression accuracy: {acc_gpu:.4f}, time: {gpu_time:.3f} sec")

# MODEL 2: scikit-learn and Pandas-driven train-test split and model training
df_pd = pd.read_csv(url, header=None, names=cols)
df_pd["income"] = df_pd["income"].str.strip()
df_pd["income"] = (df_pd["income"] == ">50K").astype("int32")
X_pd = df_pd[features]
y_pd = df_pd["income"]
X_pd = pd.get_dummies(X_pd, drop_first=True)

t0 = time.time()
X_train_pd, X_test_pd, y_train_pd, y_test_pd = train_test_split(X_pd, y_pd, test_size=0.2, random_state=42)
model_cpu = LogisticRegression(max_iter=1000)
model_cpu.fit(X_train_pd, y_train_pd)
cpu_time = time.time() - t0
acc_cpu = model_cpu.score(X_test_pd, y_test_pd)
print(f"scikit-learn Logistic Regression accuracy: {acc_cpu:.4f}, time: {cpu_time:.3f} sec")
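The listing above measures the split and the fit together in a single `time.time()` interval. For a steadier comparison, one option is a small timing helper that discards a warm-up run and keeps the best of several repeats, since the first GPU call can include one-time initialization cost. The helper below is a sketch using only the standard library; `time_call` is a hypothetical name, and the `sum` call merely stands in for a model fit.

```python
import time


def time_call(fn, *args, warmup=1, repeats=3, **kwargs):
    """Run fn several times and return (last_result, best_elapsed_seconds)."""
    for _ in range(warmup):
        fn(*args, **kwargs)  # warm-up runs are executed but not timed
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        result = fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return result, best


# Stand-in workload; swap in a training call to time a model
total, seconds = time_call(sum, range(1_000_000))
print(f"sum={total}, best of 3: {seconds:.4f} sec")
```

In this tutorial's setting, the same helper could wrap a call such as `lambda: cuLogReg(max_iter=1000).fit(X_train, y_train)`, with the scikit-learn model timed the same way.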

The results are quite interesting. They should look something like:

cuML Logistic Regression accuracy: 0.8014, time: 0.428 sec
scikit-learn Logistic Regression accuracy: 0.8097, time: 15.184 sec

As we can observe, the model trained with cuML achieved very similar classification performance to its classical scikit-learn counterpart, but it trained over an order of magnitude faster: about 0.4 seconds compared to roughly 15 seconds for the scikit-learn classifier. Your exact numbers will vary with hardware, drivers, and library versions.
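To put numbers on "over an order of magnitude", the speedup and the accuracy gap can be computed directly from the sample output. The values below are copied from that output and will differ on other machines.

```python
# Timings (seconds) and accuracies copied from the sample output
gpu_time, cpu_time = 0.428, 15.184
acc_gpu, acc_cpu = 0.8014, 0.8097

speedup = cpu_time / gpu_time
acc_gap_points = (acc_cpu - acc_gpu) * 100  # difference in percentage points

print(f"speedup: {speedup:.1f}x")                               # speedup: 35.5x
print(f"accuracy gap: {acc_gap_points:.2f} percentage points")  # accuracy gap: 0.83 percentage points
```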

Wrapping Up

This article provided a gentle, hands-on introduction to the cuML library for GPU-accelerated construction of machine learning models for classification, regression, clustering, and more. Through a simple comparison, we showed how cuML can help build effective models with significantly improved training efficiency.
