Install Alauda AI

Alauda AI now offers flexible deployment options. Starting with Alauda AI 1.4, the Knative capability is an optional feature, allowing for a more streamlined installation if it's not needed.

To begin, you will need to deploy the Alauda AI Operator. This is the core engine for all Alauda AI products. By default, it uses the KServe Standard mode for the inference backend, which is particularly recommended for resource-intensive generative workloads. This mode provides a straightforward way to deploy models and offers robust, customizable deployment capabilities by leveraging foundational Kubernetes functionalities.

If your use case requires Knative functionality, which enables advanced features like scaling to zero on demand for cost optimization, you can optionally install the Knative Operator. This operator is not part of the default installation and can be added at any time to enable Knative functionality.

INFO

Recommended deployment option: For generative inference workloads, the Standard approach (previously known as RawKubernetes Deployment) is recommended as it provides the most control over resource allocation and scaling.

Downloading

Operator Components:

  • Alauda AI Operator

    Alauda AI Operator is the main engine that powers Alauda AI products. It focuses on two core functions, model management and inference services, and provides a flexible framework that can be easily extended.

    Download package: aml-operator.xxx.tgz

  • Knative Operator

    Knative Operator provides serverless model inference.

    Download package: knative-operator.ALL.v1.x.x-yymmdd.tgz

INFO

You can download the apps named 'Alauda AI' and 'Knative Operator' from the Marketplace on the Customer Portal website.

Uploading

Upload both the Alauda AI and Knative Operator packages to the cluster where Alauda AI will be used.

Downloading the violet tool

First, download the violet tool if it is not already present on the machine.

Log into the Web Console and switch to the Administrator view:

  1. Click Marketplace / Upload Packages.
  2. Click Download Packaging and Listing Tool.
  3. Locate the right OS / CPU architecture under Execution Environment.
  4. Click Download to download the violet tool.
  5. Run chmod +x ${PATH_TO_THE_VIOLET_TOOL} to make the tool executable.
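The upload script later on this page calls violet by name, so the binary has to be reachable through PATH. The following is a minimal sketch of one way to arrange that. The helper name install_violet and the default target directory ~/.local/bin are assumptions made for illustration, not part of the violet tool.

```shell
#!/usr/bin/env bash
# Illustrative helper (not part of violet): copy the downloaded binary into a
# directory that is on PATH and mark it executable.
install_violet() {
  local src="$1"
  local dest_dir="${2:-$HOME/.local/bin}"   # assumption: this directory is on your PATH
  if [ ! -f "$src" ]; then
    echo "violet binary not found: $src" >&2
    return 1
  fi
  mkdir -p "$dest_dir"
  cp "$src" "$dest_dir/violet"
  chmod +x "$dest_dir/violet"
  # Print the installed path so the caller can confirm it.
  echo "$dest_dir/violet"
}

# Usage (the path is a placeholder):
#   install_violet "${PATH_TO_THE_VIOLET_TOOL}" && command -v violet
```

If command -v violet prints nothing afterwards, the chosen directory is not on PATH in the current shell.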

Uploading package

Save the following script as uploading-ai-cluster-packages.sh, then update its environment variables as described in the notes that follow the script.

uploading-ai-cluster-packages.sh
#!/usr/bin/env bash
export PLATFORM_ADDRESS='https://platform-address'
export PLATFORM_ADMIN_USER='<admin>'
export PLATFORM_ADMIN_PASSWORD='<admin-password>'
export CLUSTER='<cluster-name>'

export AI_CLUSTER_OPERATOR_NAME='<path-to-aml-operator-tarball>'
export KNATIVE_OPERATOR_PKG_NAME='<path-to-knative-operator-tarball>'

VIOLET_EXTRA_ARGS=()
IS_EXTERNAL_REGISTRY=

# If the destination cluster's image registry is not the platform built-in
# registry (that is, an external private or public registry), additional
# configuration is required. Uncomment the following line:
# IS_EXTERNAL_REGISTRY=true
if [[ "${IS_EXTERNAL_REGISTRY}" == "true" ]]; then
    REGISTRY_ADDRESS='<external-registry-url>'
    REGISTRY_USERNAME='<registry-username>'
    REGISTRY_PASSWORD='<registry-password>'

    VIOLET_EXTRA_ARGS+=(
        --dst-repo "${REGISTRY_ADDRESS}"
        --username "${REGISTRY_USERNAME}"
        --password "${REGISTRY_PASSWORD}"
    )
fi

# Push the Alauda AI Cluster operator package to the destination cluster.
# Variables are quoted so that paths or passwords with spaces or special
# characters are passed to violet unchanged.
violet push \
    "${AI_CLUSTER_OPERATOR_NAME}" \
    --platform-address="${PLATFORM_ADDRESS}" \
    --platform-username="${PLATFORM_ADMIN_USER}" \
    --platform-password="${PLATFORM_ADMIN_PASSWORD}" \
    --clusters="${CLUSTER}" \
    "${VIOLET_EXTRA_ARGS[@]}"

# Push the Knative Operator package to the destination cluster.
violet push \
    "${KNATIVE_OPERATOR_PKG_NAME}" \
    --platform-address="${PLATFORM_ADDRESS}" \
    --platform-username="${PLATFORM_ADMIN_USER}" \
    --platform-password="${PLATFORM_ADMIN_PASSWORD}" \
    --clusters="${CLUSTER}" \
    "${VIOLET_EXTRA_ARGS[@]}"

  1. ${PLATFORM_ADDRESS} is your ACP platform address.
  2. ${PLATFORM_ADMIN_USER} is the username of the ACP platform admin.
  3. ${PLATFORM_ADMIN_PASSWORD} is the password of the ACP platform admin.
  4. ${CLUSTER} is the name of the cluster to install the Alauda AI components into.
  5. ${AI_CLUSTER_OPERATOR_NAME} is the path to the Alauda AI Cluster Operator package tarball.
  6. ${KNATIVE_OPERATOR_PKG_NAME} is the path to the Knative Operator package tarball.
  7. ${REGISTRY_ADDRESS} is the address of the external registry.
  8. ${REGISTRY_USERNAME} is the username of the external registry.
  9. ${REGISTRY_PASSWORD} is the password of the external registry.

After updating the variables, run the script with bash ./uploading-ai-cluster-packages.sh to upload both the Alauda AI and Knative Operator packages.
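Because the script only works once every placeholder has been replaced, a pre-flight check can catch mistakes before violet is invoked. The sketch below is an assumption-laden illustration: the function name preflight_check is hypothetical, and it treats the Knative package as optional by checking its path only when the variable is set.

```shell
#!/usr/bin/env bash
# Illustrative pre-flight check (helper name is hypothetical): fail early if a
# required variable is empty, still holds a <placeholder>, or a package path
# does not point to an existing file.
preflight_check() {
  local rc=0 name value
  for name in PLATFORM_ADDRESS PLATFORM_ADMIN_USER PLATFORM_ADMIN_PASSWORD \
              CLUSTER AI_CLUSTER_OPERATOR_NAME; do
    # Indirect lookup of the variable named in $name (names are fixed above).
    eval "value=\${$name:-}"
    case "$value" in
      "")      echo "not set: $name" >&2; rc=1 ;;
      "<"*">") echo "placeholder not replaced: $name" >&2; rc=1 ;;
    esac
  done
  for name in AI_CLUSTER_OPERATOR_NAME KNATIVE_OPERATOR_PKG_NAME; do
    eval "value=\${$name:-}"
    if [ -n "$value" ] && [ ! -f "$value" ]; then
      echo "package file not found: $name=$value" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage, for example near the top of the upload script after the exports:
#   preflight_check || exit 1
```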

Installing Alauda AI Operator

Procedure

In Administrator view:

  1. Click Marketplace / OperatorHub.

  2. At the top of the console, from the Cluster dropdown list, select the destination cluster where you want to install Alauda AI.

  3. Select Alauda AI, then click Install.

    The Install Alauda AI window opens.

  4. In the Install Alauda AI window, configure the settings described in the following steps.

  5. Leave Channel unchanged.

  6. Check whether the Version matches the Alauda AI version you want to install.

  7. Leave Installation Location unchanged. The default is aml-operator.

  8. Select Manual for Upgrade Strategy.

  9. Click Install.

Verification

Confirm that the Alauda AI tile shows one of the following states:

  • Installing: installation is in progress; wait for this to change to Installed.
  • Installed: installation is complete.
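The same state can be checked from the command line. The sketch below assumes that OperatorHub on your platform is backed by Operator Lifecycle Manager, so that a ClusterServiceVersion (csv) object exists in the installation namespace (aml-operator by default, per the procedure above). The helper names operator_phases and all_succeeded are hypothetical.

```shell
#!/usr/bin/env bash
# Assumption: the operator was installed through OLM, so ClusterServiceVersion
# objects exist in the installation namespace.

# Print "<csv-name><TAB><phase>" for every CSV in the given namespace.
operator_phases() {
  local ns="$1"
  kubectl get csv -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}'
}

# Read "<name><TAB><phase>" lines on stdin; succeed only if there is at least
# one line and every phase is "Succeeded".
all_succeeded() {
  awk -F'\t' 'NF >= 2 { n++; if ($2 != "Succeeded") bad = 1 }
              END { exit (n > 0 && !bad) ? 0 : 1 }'
}

# Usage:
#   operator_phases aml-operator | all_succeeded && echo "operator installed"
```

The same check can be pointed at the namespace of any other operator installed through OperatorHub.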

Installing Alauda Build of KServe Operator

For detailed installation steps, see Install KServe in Alauda Build of KServe.

Enabling Knative Functionality

Knative functionality is an optional capability that requires an additional operator and instance to be deployed.

WARNING

If you plan to use Knative functionality, you MUST install the Knative Operator and create the Knative Serving instance BEFORE configuring the Alauda AI instance to ensure the required CRDs are available in the cluster.

1. Installing the Knative Operator

INFO

With the Knative Operator, the Knative networking layer uses Kourier, so installing Istio is no longer required.

Procedure

In Administrator view:

  1. Click Marketplace / OperatorHub.

  2. At the top of the console, from the Cluster dropdown list, select the destination cluster where you want to install.

  3. Search for and select Knative Operator, then click Install.

    The Install Knative Operator window opens.

  4. In the Install Knative Operator window, configure the settings described in the following steps.

  5. Leave Channel unchanged.

  6. Check whether the Version matches the Knative Operator version you want to install.

  7. Leave Installation Location unchanged.

  8. Select Manual for Upgrade Strategy.

  9. Click Install.

Verification

Confirm that the Knative Operator tile shows one of the following states:

  • Installing: installation is in progress; wait for this to change to Installed.
  • Installed: installation is complete.

2. Creating Knative Serving Instance

Once Knative Operator is installed, you need to create the KnativeServing instance manually.

Procedure

  1. Create the knative-serving namespace.

    kubectl create ns knative-serving
  2. In the Administrator view, navigate to Operators -> Installed Operators.

  3. Select the Knative Operator.

  4. Under Provided APIs, locate KnativeServing and click Create Instance.

  5. Switch to YAML view.

  6. Replace the content with the following YAML:

    apiVersion: operator.knative.dev/v1beta1
    kind: KnativeServing
    metadata:
      name: knative-serving
      namespace: knative-serving
    spec:
      # For ACP 4.0, use version 1.18.1
      # For ACP 4.1 and above, use version 1.19.6
      version: "1.19.6"
      config:
        deployment:
          registries-skipping-tag-resolving: kind.local,ko.local,dev.local,private-registry
        domain:
          example.com: ""
        features:
          kubernetes.podspec-affinity: enabled
          kubernetes.podspec-hostipc: enabled
          kubernetes.podspec-hostnetwork: enabled
          kubernetes.podspec-init-containers: enabled
          kubernetes.podspec-nodeselector: enabled
          kubernetes.podspec-persistent-volume-claim: enabled
          kubernetes.podspec-persistent-volume-write: enabled
          kubernetes.podspec-securitycontext: enabled
          kubernetes.podspec-tolerations: enabled
          kubernetes.podspec-volumes-emptydir: enabled
          queueproxy.resource-defaults: enabled
        network:
          domain-template: '{{.Name}}.{{.Namespace}}.{{.Domain}}'
          ingress-class: kourier.ingress.networking.knative.dev
      ingress:
        kourier:
          enabled: true

    WARNING

    • For ACP 4.0, use version 1.18.1
    • For ACP 4.1 and above, use version 1.19.6

    Notes on the YAML:

    • spec.version specifies the version of Knative Serving to be deployed.
    • private-registry (in registries-skipping-tag-resolving) is a placeholder for your private registry address. To find it, open the Administrator view, click Clusters, select your cluster, and check the Private Registry value in the Basic Info section.

  7. Click Create.
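Because the Alauda AI instance should only be configured after Knative Serving is available, it can help to block until the KnativeServing resource reports Ready. This is a minimal sketch using kubectl wait with the resource name and namespace from the YAML above. The helper name wait_knative_serving and the default timeout of 300s are assumptions.

```shell
#!/usr/bin/env bash
# Block until the KnativeServing instance created above reports condition Ready,
# or until the timeout expires (kubectl wait then exits non-zero).
wait_knative_serving() {
  local timeout="${1:-300s}"   # assumption: five minutes is enough in most clusters
  kubectl wait knativeserving/knative-serving \
    -n knative-serving \
    --for=condition=Ready \
    --timeout="$timeout"
}

# Usage:
#   wait_knative_serving 600s && kubectl get pods -n knative-serving
```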

Configuring Alauda AI Instance

Once Alauda AI Operator (and optionally, Knative Operator) is installed, the operator automatically creates the default AmlCluster instance. You do not need to create the default instance manually. Review and update the automatically created instance according to your environment.

Procedure

In Administrator view:

  1. Click Marketplace / OperatorHub.

  2. At the top of the console, from the Cluster dropdown list, select the destination cluster where you want to install the Alauda AI Operator.

  3. Locate Alauda AI and click it.

  4. On the Alauda AI page, click the All Instances tab.

  5. Wait for the default AmlCluster instance to appear, then edit it.

  6. Select a Deploy Flavor from the dropdown list:

    1. single-node for non-HA deployments.
    2. ha-cluster for HA cluster deployments (recommended for production).
  7. Set KServe Mode to Managed.

  8. Enter a valid domain in the Domain field.

    INFO

    This domain is used by ingress gateway for exposing model serving services. Most likely, you will want to use a wildcard name, like *.example.com.

    You can specify the following certificate types by updating the Domain Certificate Type field:

    • Provided
    • SelfSigned
    • ACPDefaultIngress

    By default, the configuration uses the SelfSigned certificate type to secure ingress traffic to your cluster. The certificate is stored in the knative-serving-cert secret, which is specified in the Domain Certificate Secret field.

  9. (Optional) Configure a custom OIDC provider.

    By default, Alauda AI uses ACP Dex as the OIDC provider. In this default setup, no additional spec.oidc configuration is required in the AmlCluster instance.

    If you want Alauda AI to use another OIDC provider, register an OAuth2/OIDC client in that provider, allow the Alauda AI callback URL, and then update spec.oidc in the AmlCluster YAML. The callback URL is:

    https://<platform-address>/clusters/<cluster-name>/aml/oauth2/callback

    Alauda AI reads the OIDC client secret from a Kubernetes Secret in the kubeflow namespace of the Alauda AI installation cluster. The default Secret name is aml-oidc-secret, and the Secret key must be client-secret. Update this Secret with the client secret from your OIDC provider:

    kubectl create secret generic aml-oidc-secret \
      -n kubeflow \
      --from-literal=client-secret='<oidc-client-secret>' \
      --dry-run=client -o yaml | kubectl apply -f -

    Then configure spec.oidc:

    spec:
      oidc:
        # OIDC issuer URL. This must match the issuer value advertised by the
        # provider, for example a Keycloak realm URL.
        issuerURL: https://<oidc-provider-issuer>
        # OAuth2/OIDC client ID registered in the external provider.
        clientID: <oidc-client-id>
        # Kubernetes Secret name in the kubeflow namespace. The Secret must
        # contain a client-secret key. Default: aml-oidc-secret.
        clientSecretName: aml-oidc-secret
        # OAuth2 scopes requested during login. Keep this minimal to avoid
        # oversized ID/access tokens and oversized login cookies.
        # Default: openid profile email.
        scope: openid profile email
        # Whether to use the preferred_username claim as the email value.
        # Default: true.
        preferredUsernameAsEmail: true
        main:
          # OIDC authorization endpoint. If you use discovery, this maps from
          # authorization_endpoint.
          loginURL: https://<oidc-provider-authorization-endpoint>
          # OAuth2 callback URL registered in the external provider.
          redirectURL: https://<platform-address>/clusters/<cluster-name>/aml/oauth2/callback

    If the provider exposes a standard OIDC discovery document at <issuerURL>/.well-known/openid-configuration, Alauda AI automatically fills redeemURL, jwksURL, and profileURL from discovery when these fields are not set. If discovery is unavailable, configure these fields explicitly:

    Discovery field           spec.oidc field
    authorization_endpoint    loginURL
    token_endpoint            redeemURL
    userinfo_endpoint         profileURL
    jwks_uri                  jwksURL

    Use the mapped loginURL value for main.loginURL, or for secondary.loginURL if you configure a secondary endpoint.

    spec:
      oidc:
        redeemURL: https://<oidc-provider-token-endpoint>
        jwksURL: https://<oidc-provider-jwks-endpoint>
        profileURL: https://<oidc-provider-userinfo-endpoint>

    Example Keycloak configuration:

    • In the target realm, create an OpenID Connect client.
    • Set Client ID to the value used in spec.oidc.clientID, for example aml.
    • Turn on Client authentication.
    • Under Authentication flow, select Standard flow.
    • Turn on Require PKCE and set PKCE Method to S256.
    • Set Valid redirect URIs to https://<platform-address>/clusters/<cluster-name>/aml/*.
    • Copy the client secret from the Keycloak client Credentials tab and update the aml-oidc-secret Secret shown above.
    • In the client Client scopes settings, set basic, email, and profile to Default, and set other scopes to Optional unless your environment explicitly needs them. Avoid adding large claim mappers such as groups, realm roles, client roles, address, phone, offline access, and other application-specific claims unless required. Large tokens can make the oauth2-proxy cookie exceed browser or ingress header size limits and cause login loops or HTTP 431/400 errors.
  10. (Optional) If you want to enable Knative functionality, update the AmlCluster YAML to reference the KnativeServing instance:

    INFO

    Configure this only if you installed the Knative Operator and created the KnativeServing instance in the previous steps. If you are not using Knative functionality, leave this configuration unset.

    spec:
      components:
        knativeServing:
          externalCRRef:
            apiVersion: operator.knative.dev/v1beta1
            kind: KnativeServing
            name: knative-serving
            namespace: knative-serving
          managementState: Unmanaged
  11. Under Model Catalog section, configure the following parameters:

    • Model OCI Registry Address: Registry address hosting model OCI artifacts for Model Catalog. This field has no default value and must be configured for your environment.

      This registry stores the model OCI images used by Model Catalog. Use Harbor or another production-mode OCI registry with HTTPS access enabled. Model Catalog does not support configuring imagePullSecret for pulling model OCI images, so the Harbor project or repository used for Model Catalog must allow anonymous pull access from inference cluster nodes. In Harbor, set the project that stores Model Catalog images to Public.

      If you cannot deploy a registry with HTTPS in the target environment, you can use an HTTP registry as a fallback. Configure the container runtime on every node in the inference cluster before deploying models. For containerd, add an insecure registry mirror for the registry address, for example by creating /etc/containerd/certs.d/<registry-host:port>/hosts.toml:

      server = "http://<registry-host:port>"
      
      [host."http://<registry-host:port>"]
        capabilities = ["pull", "resolve"]

      Then restart containerd or apply the equivalent node-runtime configuration through your cluster management system. This configuration must exist on the nodes where inference service pods are scheduled; otherwise the pod image pull will fail even if Model Catalog can list the model. The exact containerd configuration path can vary by Kubernetes distribution; after applying the configuration, verify that the node can pull a Model Catalog image, for example with crictl pull <registry-host:port>/<repository>:<tag>.

    • Source of PVC: Choose whether to reuse an existing PVC or create a new one. Use CreateNew to let the installation create the PVC.

    • StorageClass Name: StorageClass used when creating a new PVC.

  12. If you plan to use llm-d or vLLM-ascend, set KServe Modelcar UID to 0. The default value is 1000.

    spec:
      components:
        kserve:
          values:
            kserve:
              storage:
                uidModelcar: 0

    This setting is cluster-level and affects all Modelcar workloads in the Alauda AI installation cluster.

  13. Review the configuration and save the default AmlCluster instance.
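For the HTTP registry fallback described under Model Catalog, the hosts.toml file has to exist on every node that runs inference pods. The sketch below generates the same file content shown in that step. The helper name write_hosts_toml is hypothetical, and the default directory /etc/containerd/certs.d is the path used in the text above, which can differ by Kubernetes distribution.

```shell
#!/usr/bin/env bash
# Illustrative helper: write a containerd hosts.toml that allows pulling from
# an HTTP registry. Run as root on each inference node.
write_hosts_toml() {
  local registry="$1"                               # <registry-host:port>
  local certs_dir="${2:-/etc/containerd/certs.d}"   # may differ by distribution
  mkdir -p "$certs_dir/$registry"
  cat > "$certs_dir/$registry/hosts.toml" <<EOF
server = "http://$registry"

[host."http://$registry"]
  capabilities = ["pull", "resolve"]
EOF
}

# Usage (placeholder registry address), followed by a containerd restart and
# a test pull as described in the Model Catalog step:
#   write_hosts_toml "<registry-host:port>"
```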

Verification

Check the status of the AmlCluster resource named default:

kubectl get amlcluster default

The READY column should show True:

NAME      READY   REASON
default   True    Succeeded
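The instance can take some time to reconcile, so a polling loop is often more convenient than repeating the command by hand. This is a minimal sketch that relies on the column layout shown above (NAME, READY, REASON). The helper name wait_amlcluster_ready and the default of 60 attempts at 10-second intervals are assumptions.

```shell
#!/usr/bin/env bash
# Poll the default AmlCluster until its READY column shows True.
wait_amlcluster_ready() {
  local attempts="${1:-60}" interval="${2:-10}" i=0 ready
  while [ "$i" -lt "$attempts" ]; do
    # With --no-headers the only output line is: default <READY> <REASON>
    ready=$(kubectl get amlcluster default --no-headers 2>/dev/null | awk '{print $2}')
    if [ "$ready" = "True" ]; then
      echo "AmlCluster default is Ready"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "timed out waiting for AmlCluster default" >&2
  return 1
}

# Usage:
#   wait_amlcluster_ready        # up to about 10 minutes with the defaults
```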

Importing Built-in Model Images for Catalog

The Catalog feature in Alauda AI ships with a set of built-in model OCI images that users can deploy as inference services from the Web Console. These images must be imported into the OCI registry configured by Model Catalog before the Catalog can serve them. Without this step, the installation completes successfully, but deploying a built-in model from the Catalog will later fail with ImagePullBackOff.

Obtaining the OCI image tarballs

Built-in model images are delivered as OCI archive tarballs (.tar files compliant with the OCI Image Layout Specification). Each tarball contains a multi-architecture image (linux/amd64 + linux/arm64) for one model.

Download the tarballs from the Customer Portal Marketplace, or contact your Alauda support representative to obtain the package matching your Alauda AI version.

Pushing to Harbor

The recommended target is Harbor. The example below uses an HTTP Harbor registry. If your Harbor registry uses HTTPS, omit --plain-http and change the API URLs from http:// to https://.

Run the commands on a node that has ctr, curl, and jq installed and can reach Harbor.

First, set the environment variables:

export REG=192.168.140.0:32700
export REPO=mlops/modelcar-qwen3.5-0.8b
export TAG=v0.1.0
export TAR=./Qwen3.5-0.8B.oci.tar
export AUTH='user:password'

  1. REG: Harbor registry endpoint, without the URL scheme.
  2. REPO: Target repository path in Harbor, in the form <project>/<image-name>. For example, mlops/modelcar-qwen3.5-0.8b uses the Harbor project mlops and repository modelcar-qwen3.5-0.8b.
  3. TAG: Image tag carried by the OCI archive. If you do not know it, extract it from the tarball with the command below.
  4. TAR: Path to the OCI archive tarball obtained in the previous step.
  5. AUTH: Harbor credentials in the form user:password. Contact your platform administrator if you do not have these.

The tarball usually carries its own tag (e.g. v0.1.0) inside the OCI image layout. If needed, extract it from the tarball:

export TAG=$(tar -xOf "$TAR" index.json \
  | jq -r '.manifests[0].annotations["org.opencontainers.image.ref.name"]')
echo "$TAG"   # should print something like v0.1.0

Check whether the image tag already exists in Harbor:

URL="http://$REG/api/v2.0/projects/${REPO%%/*}/repositories/$(printf '%s' "${REPO#*/}" | sed 's|/|%2F|g')/artifacts/$TAG"

HTTP=$(curl -s -u "$AUTH" -o /tmp/harbor-artifact.json -w '%{http_code}' "$URL")

echo "HTTP=$HTTP  URL=$URL"

if [ "$HTTP" = 200 ]; then
  jq '{digest, size, push_time, arch: .extra_attrs.architecture, tags: [.tags[].name], platforms: [.references[]?.platform]}' /tmp/harbor-artifact.json
else
  jq -r '.errors[]?.message' /tmp/harbor-artifact.json
fi

If the Harbor project does not exist yet, create it before pushing:

PROJECT="${REPO%%/*}"

curl -s -u "$AUTH" -X POST "http://$REG/api/v2.0/projects" \
  -H 'Content-Type: application/json' \
  -d "{\"project_name\":\"$PROJECT\",\"public\":true}" \
  -w '\nHTTP %{http_code}\n'

If the project already exists, Harbor returns a non-2xx status code (typically HTTP 409 Conflict). After confirming the project exists, make sure it is configured as a public project, then continue with the import and push. Model Catalog does not support configuring imagePullSecret when deploying model OCI images, so inference cluster nodes must be able to pull these images anonymously.

Then run the import and push procedure:

# 1. Import into the node's containerd content store.
#    --base-name prepends $REG/$REPO to the tag carried inside the tarball,
#    producing a fully-qualified reference $REG/$REPO:$TAG.
ctr -n k8s.io images import \
  --all-platforms \
  --base-name "$REG/$REPO" \
  "$TAR"

# 2. Verify the import. You should see "$REG/$REPO:$TAG".
ctr -n k8s.io images ls -q | grep -F "$REG/$REPO"

# 3. Push to Harbor. Use --plain-http only for HTTP Harbor.
ctr -n k8s.io images push \
  --plain-http \
  --user "$AUTH" \
  "$REG/$REPO:$TAG"

# 4. Clean up the local reference on the node. Blob data is reclaimed by
#    containerd's garbage collector, leaving no persistent state on the node.
ctr -n k8s.io images rm "$REG/$REPO:$TAG"

Repeat this procedure for each built-in model tarball, varying $REPO, $TAG, and $TAR per model.
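When there are several tarballs, the repetition can be scripted. The sketch below reuses the import, push, and cleanup commands shown above and reads the tag from each tarball. The list file format (one "<repo> <tarball-path>" pair per line) and the helper name push_models are assumptions. REG and AUTH must already be exported, and --plain-http should be removed for HTTPS Harbor.

```shell
#!/usr/bin/env bash
# Import and push every model listed in a plain-text file.
# Assumed list format: "<repo> <tarball-path>" per line; blank lines and lines
# starting with # are skipped.
push_models() {
  local list="$1" repo tar_path tag
  # Read the list on fd 3 so commands inside the loop cannot consume it.
  while read -r repo tar_path <&3 || [ -n "$repo" ]; do
    case "$repo" in ""|"#"*) continue ;; esac
    # Tag carried inside the OCI layout, as extracted earlier on this page.
    tag=$(tar -xOf "$tar_path" index.json \
      | jq -r '.manifests[0].annotations["org.opencontainers.image.ref.name"]') || return 1
    ctr -n k8s.io images import --all-platforms --base-name "$REG/$repo" "$tar_path" || return 1
    ctr -n k8s.io images push --plain-http --user "$AUTH" "$REG/$repo:$tag" || return 1
    ctr -n k8s.io images rm "$REG/$repo:$tag" || return 1
  done 3< "$list"
}

# Usage (models.txt is a hypothetical file name):
#   push_models ./models.txt
```

The function stops at the first failing command, so a partially pushed list can be resumed by removing the lines that already succeeded.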

INFO

--all-platforms is critical at the import step: omitting it imports only the node's host architecture, and the subsequent push will silently miss the other platform's blobs. The flag is not needed on push, because pushing the multi-arch index automatically pushes all platforms it references.

Verifying the Harbor import

Confirm that Harbor now serves the image:

URL="http://$REG/api/v2.0/projects/${REPO%%/*}/repositories/$(printf '%s' "${REPO#*/}" | sed 's|/|%2F|g')/artifacts/$TAG"

HTTP=$(curl -s -u "$AUTH" -o /tmp/harbor-artifact.json -w '%{http_code}' "$URL")

echo "HTTP=$HTTP  URL=$URL"

if [ "$HTTP" = 200 ]; then
  jq '{digest, size, push_time, arch: .extra_attrs.architecture, tags: [.tags[].name], platforms: [.references[]?.platform]}' /tmp/harbor-artifact.json
else
  jq -r '.errors[]?.message' /tmp/harbor-artifact.json
fi

HTTP=200 means the image was successfully imported into Harbor. Expected output includes the digest, size, push time, tag, and platform information:

{
  "digest": "sha256:...",
  "size": 123456789,
  "push_time": "2026-05-06T00:00:00.000Z",
  "arch": "amd64",
  "tags": ["v0.1.0"],
  "platforms": [
    {"architecture": "amd64", "os": "linux"},
    {"architecture": "arm64", "os": "linux"}
  ]
}
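Since a missing platform is the failure mode that --all-platforms guards against, it can be worth asserting that both architectures are present instead of reading the output by eye. The sketch below inspects the raw Harbor response saved to /tmp/harbor-artifact.json by the command above, assuming the platform objects have the shape shown in the expected output. The helper name has_both_platforms is hypothetical.

```shell
#!/usr/bin/env bash
# Succeed only if the saved Harbor artifact response references both
# linux/amd64 and linux/arm64 platform manifests.
has_both_platforms() {
  local file="${1:-/tmp/harbor-artifact.json}"
  jq -e '[.references[]?.platform.architecture] as $a
         | ($a | any(. == "amd64")) and ($a | any(. == "arm64"))' \
    "$file" >/dev/null
}

# Usage:
#   has_both_platforms && echo "multi-arch image is complete"
```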

The core capabilities of Alauda AI are now deployed. To try the product quickly, see the Quick Start.

FAQ

1. Configure the audit output directory for aml-skipper

The default audit output path is /cpaas/audit on the host. However, on some operating systems (for example, MicroOS), the host root filesystem is read-only, so the /cpaas directory cannot be created. In this case, you must change the audit output path.

To change the audit output path, update the AmlCluster resource named default and add the amlSkipper.auditLogHostPath.path setting under spec.values. For example:

apiVersion: amlclusters.aml.dev/v1alpha2
kind: AmlCluster
metadata:
  name: default
  ...
spec:
  ...
  values:
    amlSkipper:
      auditLogHostPath:
        path: /var/lib/audit

NOTE

The specific path should be consistent with the collection configuration of Alauda Container Platform Log Collector.
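The same change can be applied from the command line with a merge patch instead of editing the resource by hand. This is a sketch under two assumptions: the AmlCluster resource is addressed without a namespace (as in the verification command earlier on this page), and the path contains no characters that need JSON escaping. The helper name set_audit_path is hypothetical.

```shell
#!/usr/bin/env bash
# Set spec.values.amlSkipper.auditLogHostPath.path on the default AmlCluster
# using a JSON merge patch. Other fields under spec are left untouched.
set_audit_path() {
  local path="$1"   # assumption: plain path, no quotes or backslashes
  kubectl patch amlcluster default --type merge \
    -p "{\"spec\":{\"values\":{\"amlSkipper\":{\"auditLogHostPath\":{\"path\":\"$path\"}}}}}"
}

# Usage:
#   set_audit_path /var/lib/audit
```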