Getting Started
Azure Data Explorer (ADX) commands can be broadly separated into two classes:
Data queries, which transact data on the platform
Control queries, which relate to the platform itself
This document will describe some examples of each and how they are structured. It does not attempt to cover the complete KQL syntax or all of the control options that are available. For further detailed refer to the API reference and Azure’s documentation.
The following examples assume that the package has previously been configured with connection and authentication details, see: ADXAuthentication.
The REST API interface used by this package can be somewhat verbose in some cases once familiar with its behavior it is generally straightforward to create wrapper functions in MATLAB that more concisely address a task. Some such reference implementations are provided.
MathWorks reserves the
internal
namespace and functions within this namespace are subject to change or removal without notice and should not be used.
Data queries
Hello World KQL query
run
command
The following reference implementation mathworks.adx.run
is a high-level
wrapper that abstracts the different types of queries under a generic run
method.
For more advanced use cases it may be required to drop to the lower-level interfaces
described below.
The query requests that “Hello World” be printed. This is returned as a table with a column names “mycol”. the success value is a logical indicating success of the call or not, additional optional outputs are not displayed in this example:
>> [result, success] = mathworks.adx.run('print mycol="Hello World"')
result =
table
mycol
_____________
"Hello World"
success =
logical
1
To get the contents of an entire table once can run the following command, here Parallel processing of the resulting data has been enabled (if available), with a threshold of 10000 rows of data. For more details on optimization see: Performance.md.
[result, success] = mathworks.adx.run('table("tableName", "all")', , useParallel=true, parallelThreshold=10000)
KQLQuery
command
At a lower-level the above run command in this case calls the KQLQuery
reference
implementation, here we also display the requestId
which can be used for corelation:
>> [result, success, requestId] = mathworks.adx.KQLQuery('print mycol="Hello World"')
result =
table
mycol
_____________
"Hello World"
success =
logical
1
requestId =
"4faa0b41-e308-4436-a113-376aa20da525"
REST interface
At a still lower-level one can work with the REST interface more directly.
A adx.data.models.QueryRequest
request object is first created. The .db
property
is configured with the database name of interest. If not provided a default, if set,
will be read from the configuration file. The query itself print Test="Hello World"
is assigned to the .csl
property. In this case no request properties are assigned
to the propertiesProp
property. This will be used in a later example.
% build a request object
req = adx.data.models.Query
Request;
req.db = "mydatabasename";
req.csl = 'print Test="Hello World"'
req =
QueryRequest with properties:
csl: "print mycol="Hello World""
db: "mydatabasename"
propertiesProp: []
The ADX cluster is assigned and with it a default scope. A default cluster and scope can be set in the configuration to avoid setting them for exact query and or having the values appear in source code. The ADX configuration may require that multiple scopes would be set.
% Configure a cluster and scope, defaults to be read from configuration files in future
cluster = "https://myclustername.myregion.kusto.windows.net";
scopes = ["https://myclustername.myregion.kusto.windows.net/.default"];
Now a adx.data.api.Query
query object is created using the scope(s).
% Create a query object
query = adx.data.api.Query('scopes', scopes);
query =
Query with properties:
serverUri: [0×0 matlab.net.URI]
httpOptions: [1×1 matlab.net.http.HTTPOptions]
preferredAuthMethod: [0×0 string]
bearerToken: '<unset>'
apiKey: '<unset>'
httpCredentials: '<unset>'
apiVersion: "2022-11-11"
subscriptionId: '<redacted>'
tenantId: '<redacted>'
clientSecret: '<redacted>'
clientId: '<redacted>'
scopes: "https://myclustername.myregion.kusto.windows.net/.default"
cookies: [1×1 adx.CookieJar]
Having configured the query object it can now be run the request that was previously created on a given cluster, if a default is not used:
[code, result, response] = query.queryRun(req, cluster)
code =
StatusCode enumeration
OK
result =
1×5 QueryV2ResponseRaw array with properties:
FrameType
Version
IsProgressive
TableId
TableKind
TableName
Columns
Rows
HasErrors
Cancelled
OneApiErrors
response =
ResponseMessage with properties:
StatusLine: 'HTTP/1.1 200 OK'
StatusCode: OK
Header: [1×11 matlab.net.http.HeaderField]
Body: [1×1 matlab.net.http.MessageBody]
Completed: 0
Assuming success the result is a number of Frames that describe and contain tabular data.
This data can be converted to native MATLAB tables for ease of use.
Three tables are returned 2 containing metadata and a 3rd containing the data of,
interest, it is named the PrimaryResult
.
tables = mathworks.internal.adx.queryV2Response2Tables(result)
tables =
3×1 cell array
{1×3 table}
{1×1 table}
{3×12 table}
>> tables{1}.Properties.Description
ans =
'@ExtendedProperties'
>> tables{2}.Properties.Description
ans =
'PrimaryResult'
>> tables{3}.Properties.Description
ans =
'QueryCompletionInformation'
>> tables
Looking at the PrimaryResult
shows the response to the query:
>> tables{2}
ans =
table
mycol
_______________
"Hello World"
Putting the commands together:
% build a request object
request = adx.data.models.QueryRequest();
colName = "myOutput";
message = "Hello World";
% set the KQL query
request.csl = sprintf('print %s="%s"', colName, message);
% Don't set the database use the default in .json config file
% request.db = "myDatabaseName"
% No adx.data.models.ClientRequestProperties required
% request.requestProperties
% Create the Query object and run the request
query = adx.data.api.Query();
% The default cluster to use is configured using a .json configuration file
% Run the query:
[code, result, response] = query.queryRun(request); %#ok<ASGLU>
if code == matlab.net.http.StatusCode.OK
% Convert the response to Tables
hwTable = mathworks.internal.adx.queryV2Response2Tables(result);
fprintf("Query (%s) result:\n", request.csl);
disp(hwTable);
else
error('Error running query: %s', request.csl);
end
Configuring request properties, to use tSQL using the low-level interface
The higher-level interface mathworks.adx.tSQLQuery
automatically configures
the required request property to enable tSQL. However, as an example of how properties
can be configured directly using the lower level the following example is illustrative.
Note the KQLQuery
and run
commands both accept properties as optional arguments.
If not using the KQL language syntax the the T-SQL syntax can be used to write
queries. This is accomplished the query property query_language
to sql
.
request = adx.data.models.QueryRequest();
request.csl = query;
% Don't set the database use the default in .json config file
% request.db = "databaseName";
% Configure ClientRequestPropertiesOptions & then ClientRequestProperties
crpo = adx.data.models.ClientRequestPropertiesOptions;
crpo.query_language = "sql";
crp = adx.data.models.ClientRequestProperties();
crp.Options = crpo;
request.requestProperties = crp;
% Create the Query object and run the request
query = adx.data.api.Query();
[code, result, response] = query.queryRun(request);
A reference implementation shows how this can be more concisely used:
% Run the SQL query 'SELECT top(10) * FROM mytable' to return the 1st 10 row of a table
[result, success, requestId, resultTables, dataSetHeader, dataSetCompletion] = mathworks.adx.tSQLQuery('SELECT top(10) * FROM mytable')
The additional return values can be used to indicate if the query failed:
if dataSetCompletion.HasErrors || dataSetCompletion.Cancelled
error("Query had errors or was cancelled");
end
Count the rows in a table, using the lower level interfaces
A query to count the rows in a table using KQL syntax:
req = adx.data.models.QueryRequest;
req.csl = "['mytable'] | count";
q = adx.data.api.Query();
[code, result, response] = q.queryRun(req); %#ok<ASGLU>
if code == matlab.net.http.StatusCode.OK
[result, resultTables] = mathworks.internal.adx.queryV2Response2Tables(result);
count = result.Count;
end
In more detail, this example is based on a table created from the airlinesmall.csv
data set that is included with matlab use which('airlinesmall.csv')
to get its
location. The result returned should be: 123523.
Note that in early releases some warnings about unsupported data types and conversions may be expected.
% Create a request
req = adx.ModelsDataPlane.QueryRequest;
req.db = "mydatabasename";
% The table name is: airlinesmallcsv;
% Kusto query string to count the rows in a given table
req.csl = "['airlinesmallcsv'] | count";
scopes = ["https://myclustername.myregion.kusto.windows.net/.default"];
cluster = "https://myclustername.myregion.kusto.windows.net";
q = adx.data.api.Query('scopes', scopes);
[code, result, response] = queryRun(q, req, cluster);
tables =
3×1 cell array
{1×3 table}
{1×1 table}
{3×12 table}
ans =
table
Count
______
123523
c =
int64
123523
ans =
'int64'
Progressive query execution
To execute a query in progressive mode the query request option property results_progressive_enabled
should be set to true.
query = "['tableName '] | take " + 100;
args = {"database", database "propertyNames", "results_progressive_enabled", "propertyValues", {true}, "verbose", false};
[result, success, requestId, resultTables, dataSetHeader, dataSetCompletion] = mathworks.adx.KQLQuery(query, args{:});
Get a list of tables
Get a list of tables and table properties:
req = adx.data.models.QueryRequest('db', 'mydatabase', 'csl', '.show tables details');
[code, result, response] = q.queryRun(req);
Or more concisely:
tableList = mathworks.adx.listTables(database="mydatabase")
Export data to a local parquet file
The following example code exports an entire table to a known blob using
a Shared Access Signature blob URL. The resulting parquet file can be read
into a MATLAB table using parquetread
. Parquet is recommended over CSV and
other formats for speed and data integrity reasons.
exportURL = "https://myaccount.blob.core.windows.net/<REDACTED>";
exportURI = matlab.net.URI(exportURL);
SAS = exportURI.EncodedQuery;
query = "table('mytableName', 'all')";
[tf, result] = mathworks.adx.exportToBlob(exportURI.EncodedURI, query);
if ~tf
error("exportToBlob failed");
end
downloadURL = result.Path(1) + "?" + SAS;
downloadURI = matlab.net.URI(downloadURL);
localFile = websave("exportedTable.gz.parquet", downloadURI);
T = parquetread(localFile);
Ingest a table from a local a table
To ingest large volumes of data from MATLAB then the ingestFile
and ingestTable
functions can be used:
% Read some sample data from a parquet file to create a MATLAB table
inputTable = parquetread(parquetFile);
% Ingest the table into a given table
[success, result] = mathworks.adx.ingestTable(inputTable, tableName=tableName);
To ingest small amounts of data from a MATLAB variable, typically a table the
ingestInline
. This function is not suitable for bulk data or high performance
requirements.
localPath = fullfile(matlabroot, "toolbox", "matlab", "demos", "outages.parquet");
tableName = "outages";
praquetTable = parquetread(localPath);
ingestData = praquetTable(1,:);
[success, result, requestId, extentId] = mathworks.adx.ingestInline(tableName, ingestData)
success =
logical
1
result =
1x5 table
ExtentId ItemLoaded Duration HasErrors OperationId
______________________________________ _____________________________________________ ________ _________ ______________________________________
"8de6b799-6e12-4994-b57b-ed75e15db0a8" "inproc:a607e293-dbdd-4f79-a1a2-a61982585adf" 00:00:00 false "cd4184ca-0d31-4c42-a273-5f2953f76ddf"
requestId =
"63bb1cea-b589-45ac-82ad-00d68ca96aeb"
extentId =
"8de6b799-6e12-4994-b57b-ed75e15db0a8"
To ingest from another source in ADX itself rather than MATLAB see ingestFromQuery
.
Higher-level data handling functions
The following higher-level functions are provided to assist in common operations when working with data.
Use doc mathworks.adx.<function-name>
for more details.
createTable - Creates a table in a given database
dropTable - Drops a table from a given database
exportToBlob - Exports data to an Azure blob
ingestFile - Ingests a local file to Azure Data Explorer using Azure blob
ingestFileQueue - Ingests a local file to Azure Data Explorer using Azure blob & queue (work in progress, do not use)
ingestTable - Ingests a MATLAB table to an Azure Data Explorer Table
ingestTableQueue - Ingests a MATLAB table to an Azure Data Explorer Table (work in progress, do not use)
ingestInline - Ingests limited amounts of data into Kusto directly from MATLAB
ingestFromQuery - Ingest data using the result of a command or query
listTables - Returns a list of tables and their properties
tableExists - Returns true is a given table exists
Control queries
List Clusters
Start by creating a Clusters
object:
>> clusters = adx.Api.Clusters
clusters =
Clusters with properties:
serverUri: [0×0 matlab.net.URI]
httpOptions: [1×1 matlab.net.http.HTTPOptions]
preferredAuthMethod: [0×0 string]
bearerToken: '<unset>'
apiKey: '<unset>'
httpCredentials: '<unset>'
apiVersion: "2022-11-11"
subscriptionId: '<redacted>'
tenantId: '<redacted>'
clientSecret: '<redacted>'
clientId: '<redacted>'
cookies: [1×1 adx.CookieJar]
Call the clustersList
method:
>> [code, result, response] = clusters.clustersList
code =
StatusCode enumeration
OK
result =
ClusterListResult with properties:
value: [1×1 adx.Models.Cluster]
response =
ResponseMessage with properties:
StatusLine: 'HTTP/1.1 200 OK'
StatusCode: OK
Header: [1×13 matlab.net.http.HeaderField]
Body: [1×1 matlab.net.http.MessageBody]
Completed: 0
Examine the result, in this case there is one cluster:
>> result.value
ans =
Cluster with properties:
sku: [1×1 adx.Models.AzureSku]
systemData: [0×0 adx.Models.systemData]
zones: "1"
identity: [1×1 adx.Models.Identity]
xproperties: [1×1 adx.Models.ClusterProperties_1]
etag: ""2023-01-04T12:40:35.3452388Z""
id: "/subscriptions/06<REDACTED>74/resourceGroups/mbadx/providers/Microsoft.Kusto/Clusters/myclustername"
name: "myclustername"
type: "Microsoft.Kusto/Clusters"
>> result.value.sku
ans =
AzureSku with properties:
name: Dev_No_SLA_Standard_E2a_v4
capacity: 1
tier: Basic
The
[tf, cluster] = mathworks.adx.isClusterRunning(...)
command is a convenient function to easily determine if a default or given cluster is running or not.
Management
% Get Identity token
m = adx.data.api.Management('scopes', dataPlaneScopes)
req = adx.data.models.ManagementRequest
req.csl = '.get kusto identity token'
ingestCluster = "https://ingest-myadxcluster.westeurope.kusto.windows.net"
[code, result, response] = m.managementRun(req, ingestCluster)
q = adx.data.api.Ingest()
More concisely using configured defaults:
m = adx.data.api.Management()
req = adx.data.models.ManagementRequest
req.csl = '.get kusto identity token'
[code, result, response] = m.managementRun(req)
References
For more sample commands see:
Example code:
Software/MATLAB/adxDemo.m
Basic test code:
Software/MATLAB/test/unit/*.m
For further API reference information see: