In my last post, I wrote about how to use ML_TRAIN in MySQL AI to build machine learning models directly inside the database, without sending data out to Python or other tools. This time, the focus is on the next step: how to use those trained models to detect fraud, both in real time and on past data, using ML_PREDICT.

The example here uses a simple credit card fraud detection use case, but the same pattern works for many classification problems inside MySQL AI.

ML_PREDICT in MySQL AI

In MySQL AI, ML_PREDICT is what turns a trained model into predictions: ML_TRAIN creates the model from a table, and ML_PREDICT applies that model to new data.

MySQL exposes prediction through two main routines:

  • ML_PREDICT_ROW – predicts for one or more rows passed in as JSON. This is best for real-time or small batches, for example, a payment request that needs an immediate fraud decision.
  • ML_PREDICT_TABLE – predicts for an entire table and writes the results into an output table. This is best for batch jobs, audits, and analysis on many rows.

Both of these use a model handle, for example @fraud_model, that points to a model you already trained and loaded with ML_TRAIN and ML_MODEL_LOAD.

ML_PREDICT inside the database

For fraud detection, keeping the whole pipeline inside the database brings several benefits:

  • No data movement: Transaction data stays in MySQL; there is no need to export to external ML services or files, which keeps things simpler and more secure.
  • Low latency: Because the model is stored and executed inside MySQL AI, you can call ML_PREDICT_ROW directly in your application flow and get a result fast enough for real-time decisions.
  • Familiar tools for DBAs: Models are managed with SQL routines and privileges, and can be backed up and audited like other database objects.
  • Same features for train and predict: The input tables for ML_PREDICT use the same column names as the training table, which helps avoid mismatch between the model and the data.

This means you can add fraud detection logic to an existing MySQL-based system without introducing a separate ML stack.

How the fraud model drives every prediction

One important point: MySQL AI does not randomly scan your database and “discover” fraud. It uses the model that you trained earlier.

In this example, assume there is a table fraud_data.transactions_train with historical transactions and a target column is_fraud (1 = fraud, 0 = normal). When you call ML_TRAIN on this table and target, MySQL AI learns patterns based on columns like amount, country, merchant_category, channel, device_score, hour_of_day, and so on.

The result of ML_TRAIN is stored in a model catalog and referenced by a handle, for example @fraud_model. When you later call ML_PREDICT_ROW or ML_PREDICT_TABLE, these routines use that same @fraud_model to calculate:

  • a Prediction (fraud or not fraud), and
  • probabilities in ml_results.

So every prediction in this blog is based on the fraud model you trained, not some generic model.

Architecture: in-database fraud detection

MySQL AI runs fraud detection entirely inside the database: a model trained with ML_TRAIN is loaded once, then used by ML_PREDICT_ROW for live scoring and by ML_PREDICT_TABLE for batch analysis. Both real-time and historical checks use the same fraud model.

Training and loading the fraud model

Before MySQL AI, building a fraud detection model usually meant exporting data from MySQL into an external environment such as Python, training a model with libraries like scikit‑learn or TensorFlow, saving it to a file, and then wiring a separate service or API to score transactions and push results back into the database. This added extra infrastructure, more moving parts, and often higher latency between the database and the fraud engine.

With MySQL AI, the same workflow moves inside the database: you train the fraud model directly on your transaction table using ML_TRAIN, store the model in the MySQL AI catalog, and load it into memory with ML_MODEL_LOAD so that ML_PREDICT_ROW and ML_PREDICT_TABLE can use it for in-database predictions.

Before using ML_PREDICT, you must have a trained and loaded fraud model. The following example trains a binary classification model on historical transactions and then loads it into memory so it can be used for prediction.

mysql> CALL sys.ML_TRAIN(
    ->   'fraud_data.transactions_train',
    ->   'is_fraud',
    ->   JSON_OBJECT('task', 'classification'),
    ->   @fraud_model
    -> );
+--------------------------------------------------------------+
| status                                                       |
+--------------------------------------------------------------+
| Starting model training: fraud_model                         |
+--------------------------------------------------------------+
1 row in set (0.00 sec)

+--------------------------------------------------------------+
| message                                                      |
+--------------------------------------------------------------+
| Analyzing dataset columns...                                 |
| Target column detected: is_fraud (categorical)               |
| Detected 8 feature columns.                                  |
| Missing values handled via imputation.                       |
| Data normalization applied.                                  |
+--------------------------------------------------------------+
5 rows in set (1.20 sec)

+--------------------------------------------------------------+
| message                                                      |
+--------------------------------------------------------------+
| Auto algorithm selection: Binary Classification detected.    |
| Candidate algorithms evaluated: Tree, Logistic Regression.   |
| Selected: Tree (highest validation F1-score).                |
| Training model on 80% of dataset...                          |
| Model F1-score (validation): 0.93                            |
+--------------------------------------------------------------+
5 rows in set (3.40 sec)

Query OK, 0 rows affected (3.41 sec)

Load the trained model into memory:

mysql> CALL sys.ML_MODEL_LOAD(@fraud_model, NULL);
+----------------------------------------+
| status                                 |
+----------------------------------------+
| Loading model: fraud_model             |
+----------------------------------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.35 sec)

Now @fraud_model is ready to be used by ML_PREDICT_ROW and ML_PREDICT_TABLE.

Real-time fraud detection with ML_PREDICT_ROW

For real-time detection, the goal is to decide during the card transaction whether it looks suspicious or not. A common pattern is:

  • Application receives a payment request.
  • Application builds a JSON object with all the features required by the model.
  • Application calls ML_PREDICT_ROW against MySQL AI.
  • Application decides to approve, block, or send for review based on the prediction and probability.
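On the application side, the payload-building step can be sketched in Python. The feature names below mirror the example training columns from this post; the helper name and the strict missing-feature check are my own illustration, not part of MySQL AI.

```python
import json

# Feature columns assumed to match the training table used with ML_TRAIN.
FEATURE_COLUMNS = [
    "amount", "country", "merchant_category", "channel",
    "device_score", "hour_of_day", "card_age_months", "prev_24h_txn_count",
]

def build_txn_payload(txn: dict) -> str:
    """Serialize one transaction into the JSON string expected by ML_PREDICT_ROW.

    Failing on a missing feature is safer than silently sending an
    incomplete row to the model.
    """
    missing = [c for c in FEATURE_COLUMNS if c not in txn]
    if missing:
        raise ValueError(f"missing model features: {missing}")
    return json.dumps({c: txn[c] for c in FEATURE_COLUMNS})

payload = build_txn_payload({
    "amount": 950.00, "country": "RU", "merchant_category": "jewelry",
    "channel": "ONLINE", "device_score": 0.35, "hour_of_day": 1,
    "card_age_months": 2, "prev_24h_txn_count": 14,
})
```

The resulting string can then be bound as `@txn_input` through whichever MySQL connector your application uses.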

Step 1: Build JSON input for a single transaction

mysql> SET @txn_input = JSON_OBJECT(
    ->   "amount",             950.00,
    ->   "country",            "RU",
    ->   "merchant_category",  "jewelry",
    ->   "channel",            "ONLINE",
    ->   "device_score",       0.35,
    ->   "hour_of_day",        1,
    ->   "card_age_months",    2,
    ->   "prev_24h_txn_count", 14
    -> );
Query OK, 0 rows affected (0.00 sec)

The JSON keys must match the feature column names used in the training dataset.

Step 2: Call ML_PREDICT_ROW

mysql> SELECT sys.ML_PREDICT_ROW(@txn_input, @fraud_model, NULL)\G
*************************** 1. row ***************************
sys.ML_PREDICT_ROW(@txn_input, @fraud_model, NULL): {
  "amount": 950,
  "country": "RU",
  "merchant_category": "jewelry",
  "channel": "ONLINE",
  "device_score": 0.35,
  "hour_of_day": 1,
  "card_age_months": 2,
  "prev_24h_txn_count": 14,
  "Prediction": 1,
  "ml_results": {
    "predictions": { "is_fraud": 1 },
    "probabilities": { "0": 0.15, "1": 0.85 }
  }
}
1 row in set (0.84 sec)

Here, this particular credit card transaction is classified as fraud: Prediction = 1 with an 85% fraud probability.

Step 3: Decision logic in the application

The actual decision is usually implemented in the application or service layer:

  • If fraud probability ≥ 0.9 → Block the transaction and log it for investigation.
  • If fraud probability between 0.6 and 0.9 → Mark for manual review.
  • If fraud probability < 0.6 → Approve normally.
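These thresholds can be sketched as a small Python function that parses the JSON returned by ML_PREDICT_ROW. The function name and the exact threshold values are illustrative choices, not anything prescribed by MySQL AI.

```python
import json

def decide(prediction_json: str,
           block_at: float = 0.9, review_at: float = 0.6) -> str:
    """Map an ML_PREDICT_ROW result to APPROVE / REVIEW / BLOCK."""
    result = json.loads(prediction_json)
    # Probability that the transaction is fraud (class "1").
    p_fraud = result["ml_results"]["probabilities"]["1"]
    if p_fraud >= block_at:
        return "BLOCK"
    if p_fraud >= review_at:
        return "REVIEW"
    return "APPROVE"

# Using the sample result from the ML_PREDICT_ROW call above:
sample = ('{"Prediction": 1, "ml_results": {"predictions": {"is_fraud": 1}, '
          '"probabilities": {"0": 0.15, "1": 0.85}}}')
print(decide(sample))  # 0.85 falls in the manual-review band -> "REVIEW"
```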

Because the model runs in MySQL, the only extra cost is one SQL call during the payment flow.

Checking several live transactions with ML_PREDICT_ROW

You can also score a small batch of recent transactions while still using ML_PREDICT_ROW. In this pattern, you build the JSON inside a SELECT using JSON_OBJECT(…). This lets you see both normal and fraudulent predictions side by side.

Example: score the latest 3 unreviewed transactions:

mysql> SELECT
    ->   t.transaction_id,
    ->   sys.ML_PREDICT_ROW(
    ->     JSON_OBJECT(
    ->       "amount",             t.amount,
    ->       "country",            t.country,
    ->       "merchant_category",  t.merchant_category,
    ->       "channel",            t.channel,
    ->       "device_score",       t.device_score,
    ->       "hour_of_day",        t.hour_of_day,
    ->       "card_age_months",    t.card_age_months,
    ->       "prev_24h_txn_count", t.prev_24h_txn_count
    ->     ),
    ->     @fraud_model,
    ->     NULL
    ->   ) AS fraud_prediction
    -> FROM fraud_data.transactions_live AS t
    -> WHERE t.is_fraud IS NULL
    -> ORDER BY t.event_time DESC
    -> LIMIT 3\G
*************************** 1. row ***************************
transaction_id: 50123
fraud_prediction: {
  "amount": 450,
  "country": "BR",
  "merchant_category": "luxury_fashion",
  "channel": "ONLINE",
  "device_score": 0.41,
  "hour_of_day": 1,
  "card_age_months": 2,
  "prev_24h_txn_count": 15,
  "Prediction": 1,
  "ml_results": {
    "predictions": { "is_fraud": 1 },
    "probabilities": { "0": 0.10, "1": 0.90 }
  }
}
*************************** 2. row ***************************
transaction_id: 50124
fraud_prediction: {
  "amount": 21.5,
  "country": "IN",
  "merchant_category": "grocery",
  "channel": "POS",
  "device_score": 0.96,
  "hour_of_day": 14,
  "card_age_months": 36,
  "prev_24h_txn_count": 3,
  "Prediction": 0,
  "ml_results": {
    "predictions": { "is_fraud": 0 },
    "probabilities": { "0": 0.97, "1": 0.03 }
  }
}
*************************** 3. row ***************************
transaction_id: 50125
fraud_prediction: {
  "amount": 310,
  "country": "US",
  "merchant_category": "electronics",
  "channel": "ONLINE",
  "device_score": 0.55,
  "hour_of_day": 23,
  "card_age_months": 5,
  "prev_24h_txn_count": 8,
  "Prediction": 1,
  "ml_results": {
    "predictions": { "is_fraud": 1 },
    "probabilities": { "0": 0.28, "1": 0.72 }
  }
}
3 rows in set (1.23 sec)

In this example:

  • 50123 and 50125 are predicted as fraud with high and medium‑high probabilities.
  • 50124 is predicted as a normal grocery POS transaction with 97% confidence.

This pattern is useful for near‑real‑time scoring in dashboards or small background jobs.

Batch fraud analysis with ML_PREDICT_TABLE

For background jobs, audits, and historical analysis, ML_PREDICT_TABLE is more convenient. It can process many rows at once and write predictions into a separate table.

Assume you have a table fraud_data.transactions_unlabeled with older transactions that were never scored by the model. You can generate predictions for the whole table like this:

Step 1: Generate predictions for a table

mysql> CALL sys.ML_PREDICT_TABLE(
    ->   'fraud_data.transactions_unlabeled',
    ->   @fraud_model,
    ->   'fraud_data.transactions_unlabeled_predictions',
    ->   NULL
    -> );
+----------------------------------------------------------------+
| status                                                         |
+----------------------------------------------------------------+
| Starting ML_PREDICT_TABLE on fraud_data.transactions_unlabeled |
+----------------------------------------------------------------+
1 row in set (0.00 sec)

+----------------------------------------------------------------------+
| message                                                              |
+----------------------------------------------------------------------+
| Validating input table schema...                                     |
| Feature columns matched training dataset.                            |
| Launching parallel prediction tasks...                               |
| Writing predictions to fraud_data.transactions_unlabeled_predictions |
| Completed predictions for 250000 rows.                               |
+----------------------------------------------------------------------+
5 rows in set (12.34 sec)

Query OK, 0 rows affected (12.35 sec)

MySQL AI will validate the schema, run predictions in parallel, and create the output table with predictions and an ml_results JSON column.

Step 2: Inspect prediction output

mysql> SELECT
    ->   transaction_id,
    ->   amount,
    ->   country,
    ->   Prediction,
    ->   ml_results
    -> FROM fraud_data.transactions_unlabeled_predictions
    -> LIMIT 5;
+----------------+--------+---------+------------+-------------------------------------------------------------------------------------+
| transaction_id | amount | country | Prediction | ml_results                                                                          |
+----------------+--------+---------+------------+-------------------------------------------------------------------------------------+
|          60001 | 980.00 | RU      |          1 | {"predictions":{"is_fraud":1},"probabilities":{"0":0.11,"1":0.89}}                 |
|          60002 |  45.00 | US      |          0 | {"predictions":{"is_fraud":0},"probabilities":{"0":0.96,"1":0.04}}                 |
|          60003 | 380.00 | BR      |          1 | {"predictions":{"is_fraud":1},"probabilities":{"0":0.25,"1":0.75}}                 |
|          60004 |  12.50 | IN      |          0 | {"predictions":{"is_fraud":0},"probabilities":{"0":0.98,"1":0.02}}                 |
|          60005 | 700.00 | US      |          1 | {"predictions":{"is_fraud":1},"probabilities":{"0":0.40,"1":0.60}}                 |
+----------------+--------+---------+------------+-------------------------------------------------------------------------------------+
5 rows in set (0.01 sec)

You can now join this table back to your main transactions table to label old data, build risk reports, or check how the model behaves on past incidents.
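As a rough illustration of the kind of risk report this enables, the sketch below aggregates rows fetched from the predictions table with any MySQL connector. The row tuples, the function name, and the rounding choice are all hypothetical.

```python
import json

# Hypothetical rows fetched from fraud_data.transactions_unlabeled_predictions:
# (transaction_id, Prediction, ml_results)
rows = [
    (60001, 1, '{"predictions":{"is_fraud":1},"probabilities":{"0":0.11,"1":0.89}}'),
    (60002, 0, '{"predictions":{"is_fraud":0},"probabilities":{"0":0.96,"1":0.04}}'),
    (60003, 1, '{"predictions":{"is_fraud":1},"probabilities":{"0":0.25,"1":0.75}}'),
]

def risk_summary(rows):
    """Count transactions flagged as fraud and average their fraud probability."""
    flagged = [json.loads(r[2])["probabilities"]["1"] for r in rows if r[1] == 1]
    return {
        "total": len(rows),
        "flagged": len(flagged),
        "avg_fraud_probability": round(sum(flagged) / len(flagged), 2) if flagged else None,
    }

print(risk_summary(rows))
```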

Automated vs manual flows

You can use these routines in different ways:

  • Fully automated: The payment API calls ML_PREDICT_ROW for every transaction and decides automatically based on probability thresholds.
  • Background process: A scheduled job periodically runs ML_PREDICT_TABLE on new transactions and marks risky ones in the database.
  • Hybrid: Clear high/low risk cases are handled automatically, while medium-risk cases are sent for manual review in a dashboard.

In all cases, the key idea is the same: the model you trained with ML_TRAIN is reused inside MySQL to drive fraud decisions.

Final thoughts

ML_TRAIN and ML_PREDICT together make MySQL AI very attractive for DBAs and engineers who want to bring machine learning closer to their data.

With just SQL:

  • You train a fraud model on historical transactions.
  • You load it with ML_MODEL_LOAD.
  • You use ML_PREDICT_ROW for real-time checks and ML_PREDICT_TABLE for batch analysis.

There is no need to push data out to external ML services, and you can keep using the tools and workflows that you already know from the MySQL world.
